SUMMARY

This  technical  note  shows,  largely  by  means  of  a fairly  complex 
example,  how  realistic  Fleet  Defense  problems  may  be  modeled  using  game 
theory.  Emphasis  is  placed  on  the  structuring  of  the  tactical  problem 
as opposed to obtaining its game theoretic solution; given the paucity
of realistic game solutions in use by the Navy, this emphasis is felt to
be justified. For the most part, game theory technique per se is of
secondary interest at this stage of development; standard two-person,
zero-sum (matrix) games will often suffice. However, there is one
important  exception.  Standard  theory  is  not  readily  available  for 
handling uncertainty, and this topic, which is of both practical and
theoretical interest, is considered at some length. Constrained matrix
games  are  needed  to  solve  games  with  uncertainty. 

An  overview  of  the  document  follows.  The  need  for  a game  theoretic 
treatment  of  Fleet  Defense  problems  is  stated  and  examples  of  the  kinds 
of  questions  that  may  benefit  from  an  appropriate  game  theory  analysis 
are given. The overall approach is then outlined; its essential ideas
are to:

(1) Use a scenario approach in order to have a definite
problem context.

(2) Develop an effectiveness model from the scenario in
such a way that the gaming methodology can be readily
interfaced with it.

(3) Use a flexible effectiveness measure as a part of this
model.

(4) Decompose the overall problem and develop an algorithm
for solving the overall game.

(5) Start with perfect information and then introduce
imperfect information (uncertainty).

(6) Solve the many lower-order games by whatever
methods are appropriate.

(7)  Use  the  game  solutions  to  study  tactical  decisions 
and  tradeoffs  of  all  kinds. 

A Fleet  Defense  scenario  is  then  defined,  first  by  a narrative  and 
then  by  a more  precise  event-flow  diagram.  The  scenario  involves  a 
Carrier  Task  Force  (the  Blue  player)  in  transit  within  range  of  an  enemy 
(Red  player)  land  base.  Red  reconnaissance  aircraft  search  for  the  CV 
and,  upon  detection,  call  in  attack  aircraft  carrying  cruise  missiles  to 
attack the CV. CAP, DLI, and AEW aircraft are represented and comprise
the major elements of the defense. Counterdetection and interception of
the reconnaissance aircraft are possible, as are the detection of the
raid by AEW and subsequent interception by CAP or DLI. If the reconnaissance
aircraft is lost to Red, the raiders are forced to perform
their  own  search  for  the  CV.  The  last  layer  of  Blue  defense  is  a Point 
Defense  system.  Passive/active  decisions  are  an  important  part  of  the 
scenario  for  both  players. 

The  next  step  is  the  crucial  one:  to  translate  the  event-flow 
diagram  into  a form  of  effectiveness  model  that  is  compatible  with  a 
game  theory  solution.  The  concept  is  to  define  a model  which,  when  all 
decision  variables  are  fixed,  is  a Markov  chain  model  with  absorbing 
states.

States  and  state  transitions  of  the  model  are  first  defined  in  such 
a way  that  the  flow  of  scenario  events  is  adequately  represented,  with 
the  transition  probabilities  being  in  general  functions  of  the  decision 
variables  of  the  two  players.  Four  kinds  of  states  are  distinguished, 
depending  upon  the  degree  of  control  the  two  players  have  over  the  values 
of the outgoing transition probabilities. Some states will have transition
probabilities independent of both players' decision variables;
these are chance states. Other states will have transition probabilities
controllable by just one player, and these define one-sided optimization
problems. The more interesting and vital states have transition probabilities
under joint control of the two players; these are game states.

Several  terminal  (absorbing)  states  are  defined  in  terms  of  the 
terminal  conditions  of  the  scenario.  By  associating  an  input  value 
(actually  a utility)  with  each  absorbing  state,  the  effectiveness  measure 
is  defined  as  the  average  value  (or  average  utility)  over  a large  number 
of  replications  of  the  model.  (Put  differently,  if  the  Markov  chain  model 
were replicated a large number of times in the Monte Carlo manner, a probability
distribution over the absorbing states would be determined. This
distribution would be used to weight the input utility values to form the
effectiveness measure.) In game theoretic terms this measure is defined
to be the payoff associated with the decision variables, and the relationship
of the payoff to the decision variables is the payoff function. The
min-max  game  is  played  with  this  function;  Blue  wants  to  maximize  this 
function  and  Red  wants  to  minimize  it. 

Dynamic  programming  has  been  selected  as  the  natural  method  for 
solving  the  overall  game.  This  method  works  backwards  from  what  is 
already  known  (or  evaluated),  which  in  this  application  means  initially 
working  backwards  from  the  absorbing  states  with  their  input  values.  The 
dynamic programming process "rolls back" from evaluated states to unevaluated
states. At each state a mathematical problem whose type depends
on the type of state is presented. At game states a two-person, zero-sum
game is solved; in the absence of uncertainty an ordinary matrix game
will often suffice. A detailed example of a model at a game state is
given. 

The  last  part  of  the  document  is  devoted  to  uncertainty.  A model 
is  given  which  uses  the  concept  of  a lottery;  the  game  to  be  played  is 
chosen by a lottery whose probabilistic functioning both players know.
The text gives an intuitive argument to justify the assumption of this
kind  of  uncertainty  model.  By  this  device  a game  with  uncertainty  can 
be  handled  as  a game  with  perfect  information,  but  the  information  now 
has  to  do  with  the  probability  distribution  behind  the  lottery  and  not 
tactical  information  itself.  The  detailed  game-state  model  mentioned 
above  is  reformulated  with  uncertainty,  and  a constrained  matrix  game 
is  given  for  its  solution.  Linear  programming  may  be  used  for  the 
numerical  solution  of  this  lottery  game. 

There  are  two  appendices:  Appendix  A further  develops  the  geometry 
of  the  Markov  model,  and  Appendix  B elaborates  on  the  uncertainty  model 
and  gives  a numerical  example. 

I INTRODUCTION

This  technical  note  contains  results  obtained  in  the  first  phase  of 
a project  whose  overall  objective  is  the  application  of  game  theory 
methods  to  realistic  Fleet  Defense  problems.  The  document  is  primarily 
concerned  with  the  difficult  and  fundamental  problems  of  structuring  a 
complex  problem  so  that  game  theory  methods  can  be  used  to  obtain 
solutions  at  a reasonable  expenditure  of  time  and  effort. 


There  are  at  least  three  reasons  for  applying  game  theory  to  Fleet 
Defense problems. First, analytical solutions are badly needed to help
understand  and  resolve  the  maze  of  tradeoffs  that  are  involved  in  Fleet 
Defense  at  the  tactical  level.  These  complex  problems  have  been  found 
tractable  only  by  Monte  Carlo  simulation.  Such  methods  have  their  merits 
but they do not provide a theoretically justifiable framework for resolving
the many tactical choices and tactical and equipment tradeoffs
which exist. Optimization methods generally and game theory in particular
can, when properly used, provide structure to complex, hard-to-define
real problems. Second, analytical methods used on this class of
problems are usually one-sided, with the opponent's tactics and strategy
being established by assumption and not analysis. Any two-sided analysis
which  has  been  done  tends  to  be  ad  hoc  and  the  methods  are  not  readily 
transferable  to  other  problems.  Finally,  game  theory  makes  it  possible 
to  disentangle  the  contributions  of  equipment  and  tactics.  Specifically, 
because  the  tactics  are  optimized,  any  comparison  of  effectiveness 
derived  for  differing  equipment  (or  equipment  parameters)  will  more 
truly  reflect  the  differences  in  value  of  the  equipment  itself. 

This research effort has an ambitious goal: to bring ordinary
zero-sum,  two-person  game  theory  to  bear  on  practical  Fleet  Defense 
problems  in  such  a way  as  to  be  of  interest  and  use  to  those  who  make 
Navy  tactical  decisions  and  to  the  naval  analysis  community. 

Some  of  the  kinds  of  questions  which  can  be  examined  using  the 
methodology  to  be  discussed  are: 

• Whether  to  have  CAP  aircraft  or  not,  and  if  so, 
how  many  and  where? 

• How  should  available  AEW  aircraft  be  used? 

• How should ships of a task force be deployed?
(Close formation vs. dispersed formation questions.)

• Hard  kill  vs.  soft  kill  questions. 

• When  should  EMCON  be  used,  and  what  should  be  the 
conditions  for  breaking  it? 

• How  should  CAP  surface-to-air  missile  coordination 
problems  be  resolved? 

• In  a multiple-threat  environment,  how  should  the  defense 
balance  its  forces? 

The  primary  outputs  of  games  solved  using  the  methodology  are  the 
optimal  values  of  the  decision  variables  for  the  scenario.  Many  of  these 
are  probabilities,  e.g.,  the  probability  that  CAP  should  be  used,  the 
probability  that  AEW  aircraft  should  be  used,  and  the  probability  that 
the CV should be initially in EMCON. As special cases, these probabilities
are often expected to be zero or one. When this is the case, the
decision  variables  are  interpreted  as  whether  or  not  to  use  CAP,  to  use 
AEW,  or  to  be  initially  in  EMCON. 

Not  all  decision  variables  are  probabilities,  however.  For  example, 
in  a more  advanced  model  than  presented  here  the  range  and  bearing  of  a 
CAP  station  would  be  decision  variables  and  their  optimal  values  would 
indicate  where  CAP  should  be  positioned. 

Because the effectiveness model is designed to stand on its own
when all decision variables are fixed, one also has the effectiveness
model at the optimal point to exercise. Many of the usual things that
are done with effectiveness models can be done with this model also,
provided care is taken to interpret optimality. To illustrate this, we
point out that sensitivity analysis performed at the optimal point by
varying some parameter which is not a decision variable will in general
give nonoptimal results, since the decision variables are not re-optimized
for the new parameter value. Such sensitivity results may be adequate for
many purposes, however.

II A FLEET DEFENSE SCENARIO


This  section  gives  a verbal  sketch  of  the  fleet  defense  problem  to 
be  modeled  and  solved  using  game  theory.  Although  its  structure  is 
fairly  simple,  it  contains  a number  of  the  essential  aspects  of  the  real 
tactical  problem.  The  problem  is  more  fully  defined  and  occasionally 
simplified  in  later  sections.  Following  gaming  tradition  we  label  one 
player  (or  side)  Blue  and  the  other  Red;  we  arbitrarily  choose  Blue  to 
have  the  Fleet  Defense  problem  and  Red  the  search  and  attack  problem. 

Consider  a Blue  Carrier  Task  Force  (CTF)  in  transit,  passing  within 
range  of  a Red  land  base  capable  of  supporting  attack  aircraft  (see 
Figure  1 for  the  geometry).  Blue  wants  to  complete  the  transit  without 
loss  or  damage,  while  Red  wants  to  sink  or  at  least  damage  the  CV.  (The 
larger  scenario  in  which  this  problem  is  imbedded  must  be  translated 
into  the  payoff  structure  of  this  model  by  means  to  be  explained  later.) 

The  CV  may  choose  to  deploy  Combat  Air  Patrol  (CAP)  aircraft  for 
search,  investigation,  and  intercept.  (By  CAP  is  meant  a combination  of 
fighter  and  Airborne  Early  Warning  (AEW)  aircraft.)  The  AEW  aircraft 
may  search  either  actively  or  passively.  The  CV  has  several  important 
tactical choices to make at the outset; one is whether to be in EMCON or
to  be  active.  If  initially  in  EMCON  the  CV  has  opportunities  at  later 
critical  times  to  "go  active,"  which  means  essentially  that  the  CV  turns 
its  radars  on. 

A Red reconnaissance aircraft ("Recon") flies along the path shown
in the figure, searching for the Blue force. Recon may detect either
the CV or one of the CAP aircraft, and, upon classifying the CV to a
sufficient level of confidence, calls in the attack aircraft for an
attack on the CV. (Search by Recon can be in either the active or
passive mode.)

FIGURE 1 A SIMPLE FLEET DEFENSE PROBLEM
[Figure labels: Airfield; Red attack aircraft with cruise missiles; Launch missiles; Blue CV. Notes: Recon detects CV or CAP, calls in attack aircraft. If TF detects Recon, may try to intercept before detection of CV or launch of cruise missiles.]

The  attack  aircraft  then  begin  flying  out  towards  the  CV,  using 
position  information  provided  by  Recon.  Since  some  time  is  required  for 
the  flyout,  and  since  in  any  case  the  position  information  is  imperfect, 
the  attack  aircraft  need  further  assistance  in  locating  the  CV.  Knowing 
this,  the  Blue  force  may  attempt  to  destroy  the  Recon  before  the  attack 
aircraft  launch  their  weapons  (cruise  missiles)  at  the  CV.  (For  the 
geometry  shown  the  interception  would  probably  be  performed  by  CAP 
fighters. Recon has orders which govern his behavior in the event of
attack by Blue: he may either flee if the situation warrants it or be
forced  to  stay.) 

AEW  aircraft  are  in  the  meantime  searching,  and  may  detect  the 
incoming  raid  even  at  low  altitude.  Detection  may  permit  interception 
by  CAP  or  Deck  Launched  Interceptors  (DLI)  to  counter  the  raid. 

If  the  Recon  aircraft  should  be  shot  down  (or  driven  from  the 
scene)  while  the  aircraft  in  the  raid  still  need  it  as  an  information 
source,  one  or  more  raiders  may  choose  to  "pop-up"  from  their  assumed 
low  altitude  profile  to  search  for  the  CV.  (The  raid  will  be  flying  at 
low  altitude  in  the  last  phase  of  the  run-in  in  order  to  avoid  detection 
by the CV's radars.) The pop-up may or may not result in detection;
when detection is not obtained the raid has no choice but to return
home  empty  handed.  If  detection  is  obtained  during  the  pop-up,  the 
raiders  are  assumed  to  have  enough  CV  information  to  be  able  to  launch 
their  cruise  missiles. 

The  pop-up  is  not  without  its  dangers  for  Red,  because  Blue 
may  detect  this  brief  maneuver  and  thereby  prepare  its  point  defenses 
against  the  incoming  missiles  themselves.  The  point  defense  may  or  may 
not be effective, given an opportunity to bring it into play, and the
number  of  hits  suffered  by  the  CV  will  vary  accordingly. 


From  this  sketch,  several  distinguishably  different  outcomes  may 
be  listed: 

(1)  There  is  no  detection  by  either  side  in  a 
reasonable  length  of  time 

(2)  The  Recon  is  lost  and  the  attack  aircraft  return 
home  empty  handed 

(3)  The  attack  aircraft  are  shot  down  before  launch  of 
their  missiles 

(4)  The  missiles  are  shot  down  by  the  CV's  point  defense 
system 

(5)  Missiles  impact  the  CV,  resulting  in  damage  or  sinking. 

The  payoff  structure  will  later  be  constructed  using  these  terminal 
conditions.

A pictorial form of the scenario is shown in Figure 2; it is based
on  the  critical  events  of  the  scenario.  The  diagram  may  be  considered 
an  event-flow  diagram  or  precedence  relation  diagram.  It  should  not  be 
considered  a state  diagram  for  technical  reasons  to  be  discussed  later. 
The  diagram  consists  of  nodes  and  arcs  and  is  technically  a directed 
graph.  The  double-ellipses  denote  terminal  nodes. 

A convenient  way  to  relate  the  diagram  and  the  scenario  is  to 
consider  sample  paths  (i.e.,  sequences  of  nodes)  from  the  starting  node 
to  a terminal  node  as  possible  "plays"  of  the  scenario.  The  simplest 
possible  play  is  path  (1-12),  the  path  from  starting  node  1 to  terminal 
node  12,  representing  the  situation  in  which  neither  side  detects  the 
other.  Probably  the  next  simplest  (in  terms  of  the  amount  of  interaction 
between  the  Blue  and  Red  forces)  is  the  (1-2-9-15)  path.  This  path  may 
be explained as follows: Red detects Blue at a random time T2, but Blue
either does not detect Red (T1 = ∞ or T1 is large relative to T2).
Attack aircraft are called in by Recon, the raiders run out from their
airbase unopposed and launch cruise missiles at the CV at time T5. The
missiles are not detected by the CV, and therefore impact it, ending the
play of the game.

A more complex set of interactions is given by the path
(1-2-3-4-5-7-9-10-13). Using hypothetical values for the several time
instants involved, a typical play may be described as below. All times
are in minutes, measured from an arbitrary reference.

Node  Explanation

1     Search. Both sides search according to their selected
      search mode.

2     Detection: The CTF detects Recon at T1 = 35.
      Recon detects the CTF at T2 = 40
      and the raiders are called in.

3     An intercept is planned: CAP will intercept the
      Recon at T3 = 42.

4     An AEW aircraft detects the raid at T4 = 55
      (7 minutes before the raid plans to launch its cruise
      missiles at the CV).

5     Recon is lost at T3 = 42, which forces the raid to
      either pop-up to search or to return home.

7     One of the raiders pops up at T = 60. The CV is
      detected and the planned missile launch time is
      T5 = 62.

9     The CV does not detect the pop-up, so the aircraft
      launch their missiles at T5 = 62.

10    The missiles are detected by the CV (in spite of the
      CV's missing the pop-up).

13    The missiles are shot down by the CV's Point Defense.

III MODELING APPROACH


This  research  effort  has  been  devoted  to  finding  gaming  methods  of 
use  in  analyzing  Fleet  Defense  problems.  Game  theory  requires  a payoff 
structure,  and  payoff  in  a zero-sum  context  can  be  satisfactorily 
defined  in  terms  of  an  effectiveness  measure.  Thus  a necessary  step  in 
the development of a gaming methodology is the development of an effectiveness*
model. This can, of course, be a large problem in itself.


A.  The  Effectiveness  Model 

The  approach  that  has  been  taken  is  to  first  design  an  effectiveness 
model  to  meet  two  basic  requirements: 

i)  The  model  should  be  an  adequate  effectiveness  model 
in  its  own  right 

ii)  The  model  should  be  structured  so  that  the  gaming 
model  can  be  readily  defined  in  terms  of  the 
effectiveness  model  elements. 

The  thrust  of  the  first  requirement  is  that  the  effectiveness  model 
should  have  as  parameters  all  of  the  decision  variables  of  both  players, 
and  when  these  parameters  are  fixed  the  model  should  "run"  as  an  effec- 
tiveness model,  giving  an  outcome  or  a probability  distribution  of 
outcomes  from  which  effectiveness  is  or  can  be  determined.  By  the 
second  requirement  we  imply  that  the  game  model  will  be  compatible  with 
the  effectiveness  model  so  that  the  two  will  form  an  integrated  whole. 

A design that will not be used is one in which the effectiveness model
merely provides, in a straightforward manner, elements for the payoff
matrix of a game which is then routinely solved separately. Specifically,
we are not considering an OPSTA I approach.

* Payoff and effectiveness are essentially synonymous in this document.



1. Markov Model

With this general approach established, preliminary experimentation
with various kinds of Fleet Defense situations showed the desirability
of having some general, overall model structure to work with.
Upon briefly surveying the possibilities and considering the various
features of different model types, it was decided that some form of
Markov model should be attempted. Elsewhere, such models have been
found  to  be  very  flexible  and  adaptable  to  many  tactical  situations,  and 
mathematically  they  have  many  qualities  of  linearity  which  may  be  helpful. 
Furthermore,  there  is  widespread  knowledge  of  Markov  models  and  their 
properties,  and  definitions  are  both  well  known  and  standardized. 

Figure  2,  the  event  diagram  shown  earlier,  can  already  be 
considered  as  the  beginning  of  a Markov  model  by  considering  a node  to 
be  a state.  The  terminal  nodes  introduced  for  payoff  purposes  can  be 
regarded  as  absorbing  states.  Other  nodes  are  transient  states,  and  a 
play  of  the  game  corresponds  to  a Monte  Carlo  replication  of  a Markov 
model from the starting state to absorption. The absorption probabilities
are needed for an effectiveness measure. For each terminal state
t the probability of absorption p_t in that state can be calculated. If
V_t is the payoff specified as input for terminal state t, the average
payoff V is defined to be:

\[ V = \sum_{t} p_t V_t \]
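
The average payoff can be illustrated with a short sketch. It assumes a loopless chain given as a dictionary of transition probabilities; the state names, probabilities, and payoffs below are hypothetical and are not taken from Figure 2.

```python
def absorption_probabilities(transitions, start):
    """Return {absorbing_state: probability} for a loopless Markov chain.

    `transitions` maps each transient state to {next_state: probability}.
    Any state with no outgoing arcs is treated as absorbing.
    """
    result = {}

    def walk(state, prob):
        arcs = transitions.get(state)
        if not arcs:  # absorbing state: accumulate probability mass
            result[state] = result.get(state, 0.0) + prob
            return
        for nxt, p in arcs.items():
            if p > 0.0:
                walk(nxt, prob * p)

    walk(start, 1.0)
    return result


def average_payoff(transitions, payoffs, start):
    """V = sum over terminal states t of p_t * V_t."""
    probs = absorption_probabilities(transitions, start)
    return sum(p * payoffs[t] for t, p in probs.items())


# Hypothetical chain: all names and numbers are illustrative assumptions.
TRANSITIONS = {
    "search": {"detected": 0.6, "no_contact": 0.4},
    "detected": {"missiles_hit": 0.5, "missiles_shot_down": 0.5},
}
PAYOFFS = {"no_contact": 0.0, "missiles_hit": -10.0, "missiles_shot_down": 2.0}

V = average_payoff(TRANSITIONS, PAYOFFS, "search")
print(round(V, 6))  # approximately -2.4
```

The recursion enumerates every path, which is adequate for a small loopless diagram; a larger model would more likely solve the linear equations for the absorption probabilities directly.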


Further  experimentation  led  to  the  concept  of  the  gaming  model 
presented  in  this  technical  note:  some  form  of  Markov  model  would  first 
be defined as above, consisting of states and transition probabilities.
Transition probabilities would then be worked out as a function of the
decision variables and the other parameters of the problem. The decision
variables would be used to play the game after suitable decomposition.
This form of model is a version of what is technically known as a Markov
Stochastic Game.*

2.  Decision  Variables 

In  this  document  we  will  seldom  use  the  conventional  game 
theory  terms  strategy,  pure  strategy,  and  mixed  strategy.  We  prefer  to 
use  the  term  decision  variable  to  include  them  all;  a decision  variable 
for  Blue  (Red)  is  a variable  controllable  by  Blue  (Red),  often  subject 
to  constraints.  In  a constrained  matrix  game  context  experience  has 
shown  that  descriptions  are  more  natural  using  "decision  variable"  than 
those  made  in  terms  of  strategies. 

An  example  which  relates  the  two  definitions  is  in  an  ordinary 
matrix  game:  the  decision  variables  are  the  probabilities  of  selecting 
the  pure  strategies.  These  variables  are  constrained  to  be  non-negative 
and  to  sum  to  unity  for  each  side. 
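
As an illustration of decision variables in an ordinary matrix game, the following sketch solves a 2-by-2 zero-sum game in closed form. The function name and the example matrices are assumptions made for illustration; the decision variables returned are the probabilities of each side's first pure strategy.

```python
def solve_2x2(a):
    """Solve a 2x2 zero-sum matrix game.

    a[i][j] is the payoff to the row player (the maximizer) when row i
    meets column j. Returns (p, q, value), where p is the probability of
    row 0 and q the probability of column 0 under optimal play.
    """
    row_mins = [min(row) for row in a]
    col_maxs = [max(a[0][j], a[1][j]) for j in range(2)]
    maximin, minimax = max(row_mins), min(col_maxs)

    if maximin == minimax:  # saddle point: pure strategies are optimal
        i = row_mins.index(maximin)
        j = col_maxs.index(minimax)
        return (1.0 if i == 0 else 0.0, 1.0 if j == 0 else 0.0, float(maximin))

    # No saddle point: each side mixes so the opponent is indifferent.
    d = a[0][0] + a[1][1] - a[0][1] - a[1][0]
    p = (a[1][1] - a[1][0]) / d
    q = (a[1][1] - a[0][1]) / d
    value = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) / d
    return p, q, value


# Matching pennies: both sides mix equally and the value is zero.
print(solve_2x2([[1, -1], [-1, 1]]))  # (0.5, 0.5, 0.0)
```

In both branches the returned p and q satisfy the constraints named above: each lies between zero and one, and the complementary probability is one minus it.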

There are four types of states classified by controllability
of the transition probabilities on outgoing arcs. Transition probabilities
out of some states may be controlled only by Blue, and those out
of other states controlled only by Red. These two types of state will
represent one-sided optimization problems and can be called "Blue-controlled
states" or "Red-controlled states." The more vital states
are those whose transition probabilities are under joint control; these
are called "game states" since a game will be played at each of them.
A fourth type of state has transition probabilities not controllable by
either Red or Blue; these are "chance" states.
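
The classification can be written down as a small lookup. This is only a sketch: the enum and function names are hypothetical, and it assumes each state is described by the sets of Blue and Red decision variables appearing in its outgoing transition probabilities.

```python
from enum import Enum


class StateType(Enum):
    CHANCE = "chance"
    BLUE_CONTROLLED = "Blue-controlled"
    RED_CONTROLLED = "Red-controlled"
    GAME = "game"


def classify_state(blue_vars, red_vars):
    """Classify a state by who controls its outgoing transition probabilities.

    blue_vars / red_vars are the (possibly empty) collections of decision
    variables of each player that appear in the transition formulas.
    """
    if blue_vars and red_vars:
        return StateType.GAME             # joint control: a game is played
    if blue_vars:
        return StateType.BLUE_CONTROLLED  # one-sided optimization for Blue
    if red_vars:
        return StateType.RED_CONTROLLED   # one-sided optimization for Red
    return StateType.CHANCE               # neither player has control


print(classify_state({"x4"}, {"y4"}).value)  # game
```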


* The payoff structure in such games is richer than that required here:
a payoff can also be specified for each i-to-j transition.

The idea then is that Blue should choose his decision variables
in such a way that a play of the game will tend to end in a terminal
state with a relatively large payoff. Red, on the other hand, should
choose his decision variables in such a way that the opposite happens:
termination tends to occur in states with lower values. In all of this
it is understood that sufficient mixing (randomization) of strategies is
allowed to give a true min-max solution in the basic game-theoretic
sense.

3.  The  State  Space 

So  far  we  have  the  concept  of  an  effectiveness  model  whose 
dynamics  derive  from  the  possible  paths  through  an  event-flow  diagram 
and  whose  measure  is  defined  in  terms  of  the  terminal  nodes  of  this 
diagram.  The  model  dynamics  need  to  be  considered  in  some  detail,  and 
this  entails  basic  considerations  about  constructing  a state  space.  The 
usual Cartesian product method of defining states (choosing all combinations
of all variables) results in a state space which grows large
multiplicatively. Many or even most of the states defined this way are
either  meaningless  or  very  unlikely  to  be  used.  This  is  unsatisfactory 
in  the  present  context;  a better  way  is  needed. 

In  the  present  context,  a state  is  essentially  information 
which  is  sufficient  to  define  a game  problem.  If  Figure  2 were  a state 
diagram  in  which  nodes  were  states,  it  would  be  possible  to  completely 
define  a game  at  each  node.  However,  by  working  through  a few  situations 
using  Figures  1 and  2,  one  finds  that  the  node  labels  are  insufficient 
to  define  state.  More  information  is  needed  and  the  question  is  how  to 
systematically  and  conveniently  provide  it. 

One  convenient  way  to  add  information  is  to  consider  not  just 
where  the  flow  is  but  how  it  got  there,  i.e.,  to  consider  the  path  to 
the node as well as the node itself. In terms of the elements of
Figure  2,  a state  could  be  defined  to  be  a path  from  the  starting  node 
(search)  to  some  other  node,  together  with  any  instants  of  time  Ti 
encountered  along  this  path.  Because  of  restrictions  inherent  in  the 
dynamic  programming  methodology  which  will  be  used  to  solve  the  game,  it 
would  also  be  necessary  to  restrict  the  paths  to  those  without  loops. 
That  is,  no  path  could  include  the  same  node  more  than  once. 
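
The path-based definition of state can be sketched as an enumeration of loopless paths in a directed graph. The arc list below is a hypothetical fragment built only from arcs that appear in paths quoted in the text; it is not the full Figure 2, and the time instants attached to each path are omitted.

```python
def loopless_paths(arcs, start):
    """Yield every path from `start` that visits no node more than once.

    `arcs` maps a node to the list of nodes reachable in one step.
    Each yielded tuple is a candidate state (without its time instants).
    """
    stack = [(start,)]
    while stack:
        path = stack.pop()
        yield path
        for nxt in arcs.get(path[-1], ()):
            if nxt not in path:  # forbid revisiting a node
                stack.append(path + (nxt,))


# Hypothetical fragment of the event-flow diagram (illustrative only).
ARCS = {1: [2, 12], 2: [3, 9], 3: [9], 9: [15]}

states = sorted(loopless_paths(ARCS, 1), key=lambda p: (len(p), p))
for s in states:
    print("-".join(str(n) for n in s))
```

Every prefix of a path is itself a path, so the enumeration includes intermediate states such as (1-2) as well as complete plays such as (1-2-9-15).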

4.  Aggregation 

This  method  of  defining  state  is  quite  systematic  and  easy  to 
work  with.  However,  in  this  example,  only  about  one  fourth  to  one  half 
of  the  states  would  actually  be  needed  because  some  or  even  all  of  the 
early  path  data  becomes  irrelevant  as  the  game  proceeds.  To  reduce  the 
number of states one can aggregate two or more paths which are indistinguishable
from the game standpoint and consider the result an aggregated
state.

As an example of aggregation, consider the states passing
through node 9 labelled "Attack Aircraft Launch Missiles at T5." Upon
arrival in this node, the launch occurs and Blue's problem is to detect
and shoot down the missiles. Quite a number of events which occurred
earlier are now irrelevant; in particular it does not matter whether
node 5 (Intercept Planned) and/or node 8 (Attackers pop-up) were passed
through or not. Thus paths (1-2-3-9) and (1-2-3-5-7-8-9) and (1-2-3-5-7-9)
can be aggregated, i.e., replaced by a path denoted (1-2-3-x-10),
say. Time instant (Ti) information would be aggregated or eliminated
in an obvious way and the result associated with the aggregated path to
form a single aggregated state to replace the original states.
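
A minimal sketch of aggregation follows. The grouping rule used here (paths are merged when they share their first three nodes and their final node) is an assumption chosen to reproduce the example above; the text does not specify the actual rule, and the function names are hypothetical.

```python
from collections import defaultdict


def prefix_and_last(path, keep=3):
    """Hypothetical aggregation key: the first `keep` nodes plus the last node."""
    return (path[:keep], path[-1])


def aggregate(paths, key=prefix_and_last):
    """Group paths that the key function treats as indistinguishable."""
    groups = defaultdict(list)
    for path in paths:
        groups[key(path)].append(path)
    return dict(groups)


PATHS = [
    (1, 2, 3, 9),
    (1, 2, 3, 5, 7, 8, 9),
    (1, 2, 3, 5, 7, 9),
    (1, 2, 9),  # different prefix, so it stays a separate state
]

for key, members in aggregate(PATHS).items():
    print(key, "<-", members)
```

Under this rule the three paths named in the text collapse into one aggregated state, while the path (1-2-9) remains distinct.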

There  will  be  little  mention  of  path  or  state  aggregation  in 
the sequel; it suffices to say that systematic, programmable methods are
being considered. Unless otherwise indicated, for convenience in the
rest  of  this  document  we  will  always  consider  a state  to  be  the  complete 
path  and  the  associated  times.  Appendix  A provides  more  detail  on  the 
geometry  implicit  in  Figure  2,  showing  a detailed  tree  diagram  and  using 
it  to  determine  the  number  of  states  and  the  distribution  of  path  lengths. 

5.  An  Example 

A simple  hypothetical  example  will  illustrate  much  of  what  has 
been  discussed,  and  will  furthermore  serve  to  introduce  some  of  the 
mathematical  ideas  to  come.  Figure  3 shows  a Markov  effectiveness  model 
example  with  three  transient  and  three  absorbing  states.  Because  the 
transition  probabilities  are  dependent  only  on  the  node  and  not  on  the 
path  we  may  equate  node  and  state,  i.e.,  aggregate  all  paths  to  a node 
into  a state,  and  identify  the  node  with  the  state.  Payoffs  (to  Blue, 
who  maximizes)  are  3,  -2,  and  10  for  terminating  in  states  4,  5,  and 
6,  respectively,  as  shown  on  the  diagram.  Each  transient  state  can  make 
transitions  to  three  other  states  as  indicated  by  the  directed  arcs. 
Algebraic  formulas  are  given  on  the  arcs  for  transition  probabilities; 
for  example,  from  state  1 (the  starting  state)  the  probability  of  making 
a transition to state 4 is 2 x_4, to state 3 the transition probability
is y_4, and to state 2 it is (1 - 2 x_4 - y_4). The x_i and the y_i are the
decision  variables  for  Blue  and  Red,  respectively. 

By  summing  the  transition  probabilities  out  of  each  state  one 
sees  that  they  add  to  one,  as  required.  However,  further  constraints 
have  to  be  added  to  make  the  probabilities  each  lie  in  the  interval  zero 
to  unity.  Constraints  selected  for  this  purpose  are: 

$$
\begin{aligned}
0.2 &\le x_1,\ x_2 \le 0.6 \\
0 &\le y_1,\ y_2 \le 0.8 \\
0.1 &\le y_3 \le 0.3 \\
0 &\le x_3 \le 0.4 \\
0 &\le x_4 \le 0.3 \\
0 &\le y_4 \le 0.4
\end{aligned}
$$

With these constraints we have a well-defined game with a min-max
solution in the decision variables $x_1$ through $x_4$ and $y_1$ through $y_4$.

Solution  of  the  entire  game  without  decomposition  appears  to 
be  quite  difficult  but  solution  by  dynamic  programming  is  fairly  direct. 
We  proceed  by  working  backwards  from  states  whose  value  is  already 
determined,  and  at  this  point  only  the  terminal  states  have  their  values 
determined  (they  are  the  payoffs  3,  -2,  and  10).  Since  the  diagram  is 
loopless there has to be at least one other state which can transition
only to evaluated states; here it is state 3. Thus we set up a constrained
matrix game at state 3 in which Blue selects $x_1$ and $x_2$ and Red
selects $y_1$ and $y_2$ subject to their constraints. Letting * denote
"optimal," the solution to this game is $x_1^* = x_2^* = 0.2$, $y_1^* = y_2^* = 0$, and
the value of the game is $V_3^* = -2$. (This value is the mean payoff in a
game started from state 3, when both players play optimally from state 3
until the end of the game.) State 3 is now considered evaluated, and
the value $V_3^*$ may be associated with state 3, making it, in effect, a
terminal state. A second game is played at state 2; Blue chooses
$x_3^* = 0.4$ and Red chooses $y_3^* = 0.1$ for optimal play, with value $V_2^* = 5.2$.
Thus if state 1 were removed from the problem entirely and the game
played with state 2 as starting state, the value of the game would be
5.2. The state 2 game, it should be noted, could not have been played
before the state 3 game because the value at node 3 was unknown.


The final step is the solution of the game at state 1, which
is played in the same manner. For optimal play $x_4^* = 0$ and $y_4^* = 0.4$, with
value $V_1^* = 2.32$. Since state 1 is the actual starting state, the value
of the entire game is 2.32. Overall, the probabilities of absorption into
states (4, 5, and 6) are (0, 0.64, 0.36). The saddle point property states
that Red cannot reduce the mean payoff below 2.32 if Blue uses the
optimal $x_i^*$ listed above for each game; a similar statement holds for
Blue when Red plays optimally.
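
The state 1 game can be checked numerically. The following is a minimal sketch, not part of the original analysis: it takes the values 5.2 and -2 already found for states 2 and 3, and relies on the fact that the state 1 payoff is linear in x4 and in y4 separately, so optimal play lies at the endpoints of the constraint intervals. The function and variable names are hypothetical.

```python
# Values of the states reachable from state 1 (taken from the text).
V4 = 3.0    # terminal payoff at state 4
V3 = -2.0   # game value already found at state 3
V2 = 5.2    # game value already found at state 2


def payoff_state1(x4, y4):
    """Mean payoff at state 1: 2*x4 -> state 4, y4 -> state 3, rest -> state 2."""
    return 2 * x4 * V4 + y4 * V3 + (1 - 2 * x4 - y4) * V2


X4_ENDPOINTS = (0.0, 0.3)   # Blue's constraint interval for x4
Y4_ENDPOINTS = (0.0, 0.4)   # Red's constraint interval for y4

# Blue maximizes the worst case over Red's endpoints (max-min) ...
maximin = max(min(payoff_state1(x, y) for y in Y4_ENDPOINTS) for x in X4_ENDPOINTS)
# ... and Red minimizes the best case over Blue's endpoints (min-max).
minimax = min(max(payoff_state1(x, y) for x in X4_ENDPOINTS) for y in Y4_ENDPOINTS)

# Cross-check against the absorption probabilities quoted in the text.
absorption = {4: 0.0, 5: 0.64, 6: 0.36}
payoffs = {4: 3.0, 5: -2.0, 6: 10.0}
mean_payoff = sum(absorption[s] * payoffs[s] for s in absorption)

print(round(maximin, 2), round(minimax, 2), round(mean_payoff, 2))
```

The max-min and min-max values coincide at 2.32, which is the saddle point property at state 1, and the same number is recovered from the absorption probabilities and terminal payoffs.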

B. The Game Model

The  previous  section  discussed  the  form  of  the  Markov  effectiveness 
model,  with  states  tentatively  defined  as  loopless  paths  to  nodes  and 
with  absorbing  states  designed  so  that  a satisfactory  effectiveness 
measure  could  be  defined.  Transition  probabilities  are  often  functions 
of  the  decision  variables  of  both  Blue  and  Red,  although  they  may  on 
occasion  be  functions  of  Red’s  variables  only,  or  Blue's  variables  only, 
or  even  be  independent  of  both  Red's  and  Blue's  variables.  As  important 
as  it  is,  the  effectiveness  model  is  auxiliary  to  the  gaming  model  to  be 
discussed  next. 

The  fundamental  elements  of  this  Markov  model  (like  those  of  any 
Markov  model)  are  the  states,  and  it  has  been  pointed  out  that  games  are 
theoretically  played  at  the  state  level.  However,  the  structure  of  this 
problem  will  permit  us  to  play  games  at  the  nodal  level  instead.  Thus, 
a single  game  can  be  defined  at  a node  to  represent  the  entire  family 
of  states  at  the  node,  with  the  parameters  of  these  states  regarded  as 
parameters  of  the  nodal  game.  For  purposes  of  clarity,  however,  the 
discussion  will  assume  that  games  are  played  at  states. 

1. Dynamic Programming


The  overall  game  has  already  been  described:  Blue  wants  to 
maximize  the  average  payoff  and  Red  wants  to  minimize  it,  where  the 
average  payoff  is  the  weighted  average  of  the  input  utility  values  at  the 
absorbing  states.  This  game  will  generally  be  far  too  large  to  solve  as 
a unit,  however.  Therefore  decomposition  of  some  kind  is  required,  and 
we  have  chosen  a Markov  chain  model  for  this  purpose.  Subgames  can  then 
be  played  at  the  level  of  a state  and  their  solutions  linked  together  by 
dynamic programming to solve the overall game*.

* Other methods are probably available as well; in general they use
either value iteration or policy iteration. Here we discuss only the
dynamic programming approach; others may be found in Pollatschek and
Avi-Itzhak, "Algorithms for Stochastic Games with Geometrical
Interpretation," Management Science, Vol. 15, No. 7 (March 1969).

A finite Markov model without loops is an almost ideal candidate
for solution by dynamic programming. The central idea in the
dynamic programming approach, working backwards from already evaluated
states, has already been shown by example. The goal is to find, for
each state, the game value at that state and the optimal strategies at
that state. The game values will "propagate" through the solution
because the solution at a state depends upon the game values at already
evaluated states. Optimal decision variables do not propagate in this
way. Therefore, to form an uncluttered geometrical image of the dynamic
programming process, one can visualize recording the game values, as
they are found recursively, near the state symbol on the state diagram.
This recording serves also to indicate that this state has been
"evaluated." Absorbing states, with their values specified as inputs,
are by definition "already evaluated." "Working backwards" is not
usually uniquely defined, since there is usually more than one way to
choose the ordering of states. To start the process, one scans the
state diagram, looking for some state which can transition only to
already-evaluated  (i.e.,  absorbing)  states.  Because  the  state  diagram 
is  loopless  such  a state  can  always  be  found.  Suitable  calculations  are 
made  at  that  state  according  to  its  type.  If  it  is  a game  state,  the 
game  will  be  solved  by  whatever  method  is  required;  ordinary  matrix  games 
and constrained matrix games are expected to suffice in most instances.
Both of these types of games are solvable by linear programming. If the
state has its outgoing transition probabilities controllable by only one
player, a one-sided optimization is performed and the resulting value
recorded as the game value. If neither player has control of the transition
probabilities, the calculation reduces to a straightforward
probability  calculation  without  optimization.  In  any  case  this  state 
joins  the  "already  evaluated"  list  and  the  search  begins  for  another 
state  which  has  transitions  only  to  already  evaluated  states.  The 
process  continues  until  all  states  have  been  evaluated. 
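
The scanning procedure just described can be sketched in a few lines. This is an illustrative sketch only, not the report's program: the function name `evaluate_loopless` and the `solve_state` callback are hypothetical, and the callback stands in for whatever calculation the state type requires (matrix game, one-sided optimization, or plain expectation). The demonstration reuses the earlier example under optimal play; the state 1 probabilities come from the text, while the arcs and probabilities for states 2 and 3 are assumed values chosen only to be consistent with the quoted game values.

```python
def evaluate_loopless(successors, terminal_values, solve_state):
    """Evaluate every state of a loopless diagram by working backwards.

    successors[s]          -> states reachable from transient state s
    terminal_values        -> input payoffs at the absorbing states
    solve_state(s, values) -> value at s, given the values of its
                              (already evaluated) successor states
    """
    values = dict(terminal_values)      # absorbing states start "evaluated"
    pending = set(successors)
    order = []
    while pending:
        # Scan for a state that can transition only to evaluated states.
        ready = next((s for s in sorted(pending)
                      if all(t in values for t in successors[s])), None)
        if ready is None:
            raise ValueError("no evaluable state found: the diagram has a loop")
        values[ready] = solve_state(ready, values)
        order.append(ready)
        pending.remove(ready)
    return values, order


# Transition probabilities under optimal play.
PROBS = {
    1: {4: 0.0, 3: 0.4, 2: 0.6},   # from the text: x4* = 0, y4* = 0.4
    2: {3: 0.0, 5: 0.4, 6: 0.6},   # assumed for illustration
    3: {5: 1.0},                   # assumed for illustration
}


def expected_value(state, values):
    """Stand-in for the subgame solution: a plain expectation."""
    return sum(p * values[t] for t, p in PROBS[state].items())


successors = {s: list(arcs) for s, arcs in PROBS.items()}
values, order = evaluate_loopless(successors, {4: 3.0, 5: -2.0, 6: 10.0},
                                  expected_value)
print(order, round(values[1], 2))
```

With these inputs the states are evaluated in the order 3, 2, 1, matching the order used in the example, and the starting state receives the value 2.32.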

Somewhat more formally, let $x_s$ and $y_s$ be the decision vectors
for Blue and Red at state $s$, and let $p_{s,e}(x_s, y_s)$ be the transition
probabilities from $s$ to evaluated states $e$. Let $V_e$ be the value at
state $e$ which has already been found and recorded. At node $s$ the
payoff function is

$$v_s(x_s, y_s) = \sum_e p_{s,e}(x_s, y_s)\, V_e$$

and the game is to min-max this expression. The result is a pair of
optimal decision vectors $x_s^*$ and $y_s^*$, and a game value $V_s^*$. A constrained
matrix game is used whenever $x_s$ and/or $y_s$ must satisfy additional linear
constraints $A x_s = a$ and/or $B y_s = b$.
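
For reference, Blue's side of such a constrained game can be written as a single linear program. The formulation below is the standard one obtained by linear programming duality, stated under assumptions that the text does not make explicit: the payoff at state $s$ is bilinear, $v_s = x_s^{\mathsf T} G\, y_s$, where $G$ is a hypothetical payoff matrix assembled from the $p_{s,e}$ and $V_e$; both decision vectors are nonnegative; and Red's feasible set $\{y_s \ge 0,\ B y_s = b\}$ is nonempty and bounded. The vector $u$ is an auxiliary variable introduced here.

```latex
\begin{aligned}
\max_{x_s,\,u}\quad & b^{\mathsf T} u \\
\text{subject to}\quad & B^{\mathsf T} u \le G^{\mathsf T} x_s, \\
& A x_s = a, \qquad x_s \ge 0, \qquad u \ \text{unrestricted in sign.}
\end{aligned}
```

When $A$ and $B$ are each a single row of ones and $a = b = 1$, $u$ is a scalar and the program reduces to the familiar linear program for an ordinary matrix game: maximize $u$ subject to $u \le (G^{\mathsf T} x_s)_j$ for every Red column $j$.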




2. Extensions

This  relatively  simple  structure  will  be  refined  and  extended 
when  applied  to  the  Fleet  Defense  model.  Briefly,  there  are  two  major 
modifications  needed: 

• Iteration of the entire process will be required
due to the way uncertainty will be modeled.

• Decision variables which occur at more than one
state will probably have to be introduced into the
state itself. (Decision variables may not be
associated with nodes in the convenient way they
were in the example. One would like to have
decision variables distributed as they were in
the example, i.e., each occurring at a unique state.
When this is the case, a decision variable may
be evaluated as soon as it is encountered in the
dynamic programming algorithm; when this is not
the case, a multiple solution has to be carried
along.)

Both  of  these  items  are  discussed  further  in  later  examples. 

The impact of these modifications is, as always, on model size
and/or running time. Iteration influences only run time; total time will
be roughly proportional to the number of iterations. Carrying along
multiple solutions, on the other hand, will influence both run time and
model size. Fortunately, neither factor is expected to limit the feasibility
of the overall approach.

IV FLEET DEFENSE MODEL

We  now  return  to  the  primary  purpose  of  this  research  and  begin  to 
further  develop  a Fleet  Defense  Model.  The  event-flow  diagram  presented 
earlier  as  Figure  2 is  the  basis  for  the  real  world  problem  and  the 
previous  section  is  the  basis  for  the  methodology. 

A.  A Partial  Summary 

A summary  will  be  made  at  this  point  to  unify  the  operational 
elements  and  mathematical  ideas  comprising  the  model.  Figure  1 shows 
the  geometry  of  the  scenario  and  involves  the  CV,  Recon,  CAP,  airfield, 
attack  aircraft,  and  cruise  missiles  launched  by  the  attack  aircraft. 
Figure  2 has  nodes  representing  events  or  conditions,  and  a path  may  be 
traced  from  the  starting  node  (search)  to  any  of  five  terminal  nodes, 
to  represent  one  play  of  the  game  or  one  realization  of  the  scenario. 

A utility  (or  payoff)  is  associated  with  each  terminal  node;  Blue's 
objective  is  to  maximize  the  average  value  of  utility  (or  payoff)  and 
Red's  objective  is  to  minimize  it;  this  is  where  the  zero-sum  assumption 
is  made. 

A state  is  tentatively  defined  to  be  a path  to  a node,  i.e.,  it  is 
an  ordered  sequence  of  nodes.  Any  timing  information  picked  up  along 
the  path  is  carried  along  with  it  to  complete  the  state  definition. 

Thus, 1-2-3-5 and (T1 = 30, T2 = infinity, T3 = 40) is a state; the
corresponding problem defined for that state is to maximize payoff given
that the Task Force detected Recon at T1 = 30, the Recon did not detect
the Task Force, and the Recon was lost at T3 = 40. Present time is
considered to be T = T3, since by convention the node label applies at
the instant the node is entered.


In  terms  of  its  states,  the  primary  parameters  of  the  model  are 
transition  probabilities,  which  depend  in  general  upon  the  decision 
vectors  of  both  Red  and  Blue,  the  state,  and  the  Ti  associated  with  the 
state. 

For  any  given  values  of  the  decision  vectors  x and  y , the 
probabilities  of  absorption  in  each  terminal  state  can  be  calculated, 
and  from  these  the  average  payoff  determined.  Blue  chooses  x in  order 
to  influence  the  flow  to  terminate  (on  the  average)  in  a state  with 
relatively  high  payoff,  while  Red  chooses  y in  order  to  influence  the 
flow to terminate in states with low payoff. Components of the decision
vectors may be physical quantities and/or probabilities; in particular,
some of them may be the probabilities of selecting from the defined
decisions at the state.
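
As a small numerical illustration of this calculation, the sketch below propagates probability forward through a loopless chain for fixed decision vectors and then forms the average payoff. It reuses the earlier example with three transient and three absorbing states under optimal play. The state 1 probabilities come from that example; the arcs and probabilities assumed for states 2 and 3 are hypothetical values chosen only to agree with the quoted game values. The function name is likewise an assumption.

```python
def absorption_probabilities(transitions, start, order):
    """Propagate probability mass forward through a loopless chain.

    transitions[s] -> {next_state: probability} for each transient state s
    order          -> transient states, each listed before its successors
    Returns the probability of ending in each absorbing state.
    """
    mass = {start: 1.0}
    for s in order:
        m = mass.pop(s, 0.0)            # mass currently sitting at s
        for t, p in transitions[s].items():
            mass[t] = mass.get(t, 0.0) + m * p
    return mass


# Transition probabilities for fixed (here, optimal) decision vectors.
transitions = {
    1: {4: 0.0, 3: 0.4, 2: 0.6},   # from the example: x4 = 0, y4 = 0.4
    2: {3: 0.0, 5: 0.4, 6: 0.6},   # assumed for illustration
    3: {5: 1.0},                   # assumed for illustration
}
terminal_payoffs = {4: 3.0, 5: -2.0, 6: 10.0}

absorbed = absorption_probabilities(transitions, start=1, order=[1, 2, 3])
average_payoff = sum(absorbed[s] * terminal_payoffs[s] for s in absorbed)

print({s: round(p, 2) for s, p in sorted(absorbed.items())},
      round(average_payoff, 2))
```

Under these assumptions the absorption probabilities for states 4, 5, and 6 come out as 0, 0.64, and 0.36, and the average payoff is 2.32, in agreement with the example.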

B.  General  Observations 

Several  general  observations  should  be  made  about  the  event-flow 
diagram  from  which  the  Markov  model  is  defined.  First,  it  is  described 
as  though  there  were  just  one  unit  of  each  type  (one  AEW  aircraft,  one 
CAP interceptor, etc.). This has been done for simplicity; multiple
units can be added later. Similarly, various outcomes are considered
to  have  only  two  possibilities:  the  Recon  is  lost  or  it  is  not,  the 
attack  aircraft  detect  the  CV  or  they  do  not,  and  missiles  either  impact 
the  CV  or  the  missiles  are  shot  down  by  point  defense.  It  should  be 
fairly  obvious  that  more  refined  outcomes  can  be  added  once  the  basic 
structure  is  finished.  Another  way  in  which  outcomes  can  be  refined  is 
to  use  the  "path  memory"  that  a state  may  have.  That  is,  because  the 
absorbing states may remember the paths taken to reach them, the utilities
Vi specified as inputs may be made a function of path as well as
terminal node. As a special case of this, a utility term may be defined
at an important event at an interior (transient) node. This payoff
is analogous to the stagewise payoff h_t(x, y) in N-stage games analysis.


The named times (T1 through T6) are needed for the model's logic,
and decisions based on them are similarly treated in a black-white
manner. If, for example, the Recon is lost an instant before the attack
aircraft are due to launch (i.e., if T3 < T5), then the attacker is
forced to either pop-up to search or return home without contact. These
somewhat unrealistic features can be improved at a later time, at some
cost in complexity.

Another  important  thing  to  point  out  about  Figure  2 is  that  it  is 
described  from  the  "true"  point  of  view.  Equivalently,  we  might  say 
that  at  this  point  the  game  is  played  with  perfect  information  for  both 
players.  In  particular,  each  side  knows  whether  it  has  been  detected 
and  what  time  the  detection  occurred.  In  a later  section  when  uncertainty 
is considered, the perfect information assumption will be removed. However,
the flow of the model should still be considered to be flow of the
true  situation,  and  not  the  estimate  that  Blue  or  Red  has  of  the  situation. 
Uncertainty  will  be  handled  on  a node-by-node  basis. 

Some  remarks  about  timing  are  also  needed.  We  wish  to  have  the 
effectiveness  model  structured  so  that  transitions  in  the  model  occur  in 
a time sequence; that is, if a path goes from node A to node B, the time
of entry of A should be earlier than the time of entry of B. With
the exception of node 2 this holds true on Figure 2. For node 2 the
times T1 and T2 can have any relation to each other, and T2 may be less
than T3 (the time node 3 is entered). Again, T1 (the time the Task Force
detects the Recon) may be later than T4 (the time the AEW aircraft
detects the raid). Once beyond node 2, however, the timing is preserved.

C. Decision Variables


This section discusses decisions, decision variables, and their
probabilities for both Red and Blue. A decision variable is essentially
a controllable variable; it may or may not be in one-to-one correspondence
with a decision in the usual sense of the word. The term strategy
as used in game theory is implicitly defined here in terms of a collection
of decision variables. Indeed, the specification of a value for each
decision variable constitutes a strategy*.

* Actually a behavioral strategy.

1. Specification of Probabilities

Many of the important decision variables are binary (yes/no)
and represent individual decisions. A convention has been adopted
regarding notation: Blue's discrete decision variables are lower case
b's, and Red's are lower case r's. Continuous variables use other notation.
For example, the Blue variable b1 is defined by:

b1 = 0 if CAP will not be used

b1 = 1 if CAP will be used.

This usage is for pure strategies only. More generally, we allow the
"mixing" of this decision by introducing probabilities as decision
variables:

Prob(b1 = 0) is the probability that CAP will not be used

Prob(b1 = 1) is the probability that CAP will be used.

Alternatively and preferably, it is also possible to introduce b1 as a
component of a joint probability. For example, the decision as to whether
the CV should be initially active (decision variable b2) may be considered
jointly with the CAP decision. If b2 is defined to be zero for the CV
initially in EMCON and unity for the CV initially active, we may consider
the four probabilities:


Prob(b1 = 0, b2 = 0)

Prob(b1 = 0, b2 = 1)

Prob(b1 = 1, b2 = 0)

Prob(b1 = 1, b2 = 1)

as decision variables instead of P(b1 = 0), P(b1 = 1), P(b2 = 0), P(b2 = 1).

Choosing between these two formulations (independent specifications
vs. joint specifications) is not simply a matter of taste. The
two formulations are essentially those of behavioral strategies vs.
adaptive strategies as discussed in Reference 1. It can be shown that
the simpler independent specification is often equivalent to the more
complex joint specification. In any case, the joint specification is
often ruled out from problem size considerations. For example, if there
are five decision variables with ten levels each, the independent
approach requires 5 x 10 = 50 probabilities and the joint approach requires
10 x 10 x 10 x 10 x 10 = 100,000 probabilities. For this research
effort no hard and fast rule can be given for the form of probability
specification. The principal guiding rule is one of pragmatism: joint
specifications will be used to the extent that there is room for them,
and when they are used it will be for those variables which most require
joint specification.
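
The size comparison can be made concrete with a short sketch. It is an illustration under stated assumptions rather than part of the original study: the counts 50 and 100,000 quoted above are reproduced by taking five variables with ten levels each, the helper names are hypothetical, and the marginal probabilities for b1 and b2 are arbitrary numbers. The last function builds the joint table implied by independent marginals, which is the special case in which the two specifications coincide.

```python
from itertools import product
from math import prod


def independent_count(levels):
    """Probabilities needed when each variable is specified separately."""
    return sum(levels)


def joint_count(levels):
    """Probabilities needed when all variables are specified jointly."""
    return prod(levels)


def joint_from_independent(marginals):
    """Joint distribution implied by independent marginal distributions.

    marginals -> list of {level: probability} dicts, one per variable
    """
    joint = {}
    for combo in product(*(m.items() for m in marginals)):
        outcome = tuple(level for level, _ in combo)
        joint[outcome] = prod(p for _, p in combo)
    return joint


levels = [10] * 5                      # five variables, ten levels each
n_independent = independent_count(levels)
n_joint = joint_count(levels)

# Arbitrary illustrative marginals for the binary decisions b1 and b2.
p_b1 = {0: 0.3, 1: 0.7}
p_b2 = {0: 0.5, 1: 0.5}
joint_b1_b2 = joint_from_independent([p_b1, p_b2])

print(n_independent, n_joint, len(joint_b1_b2))
```

A general joint specification has more freedom than this product form, since the four entries of the b1, b2 table need only be nonnegative and sum to one.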


2.  Node-by-Node  Discussion 


A node-by-node  discussion  of  decisions  and  decision  variables 
will  now  be  undertaken.  A symbol  will  be  introduced  for  decision  variables 
which  are  reasonably  well-defined.  We  occasionally  identify  decisions 
and considerations from which further decisions could be defined in a
more  thorough  study. 


Node 1: Search

This phase will terminate in either No Detections or in one or
more detections at times T1, T2 as shown on Figure 2. T1 and T2 are random
variables selected from distributions in part determined by decision
variables. They may be considered fixed for the remainder of the flow.
Important  Blue  decisions  at  this  node  will  continue  to  exert  an  influence 
over  the  problem  flow  in  later  states.  Three  such  decisions  are: 

b1: will Blue employ CAP and AEW?

b2: will the CV be initially in EMCON, or active?

b3: will the AEW aircraft search passively, or actively?

Note that b3 is conditioned on b1: if there are no AEW aircraft, then
b3 is irrelevant.


Red's  major  decision  is  also  binary: 

r1: will Red search passively, or actively?

We are here implicitly considering a single search plan for the Red
Recon; a more refined analysis might consider allowing Red to choose
one of several search plans as an option. Another refinement requiring
further Red decisions relates to the criterion for calling out the Red
attack aircraft: should Red Recon do this on an ESM contact alone or
should he wait for radar contact and more positive identification?


Node  2:  Detection 


Entry into this node implies that Blue has detected Red
(at T1), or Red has detected Blue (at T2), or each has detected the
other (at T1, T2, respectively). It is assumed that Red automatically
calls up the attack aircraft at T2, whatever its relation to T1. Blue's
major decision is:

b4: will the CAP fighter attempt to intercept the Recon?

Two  Red  decisions  are: 

r2:  will  the  Recon  retire  from  the  scene  following  the 
communications  to  the  attack  aircraft? 

r3: will the Recon retire from the area when he discovers
that  he  is  being  intercepted,  even  if  he  is  needed  by 
the  incoming  raid? 

These  three  decisions  would  probably  be  based  on  considerations  not  yet 
fully  defined.  From  Blue's  viewpoint,  using  the  interceptor  to  attack 
the  Recon  reduces  the  defense  capability  against  the  Red  raid  (when  it 
comes);  besides,  Blue  is  not  sure  of  whether  the  Recon  is  actually 
required  by  the  raid  once  initial  CV  position  is  determined  and  relayed. 
Red's  decisions  revolve  around  some  of  the  same  considerations:  if  he 
is  not  actually  necessary  he  may  linger  around  as  a decoy,  at  some  risk 
of  being  shot  down.  If  Red  Recon  is  necessary  for  the  mission,  he  still 
will  not  choose  to  stay  if  he  feels  that  he  will  be  shot  down  with  near 
certainty  before  the  launch  is  consummated. 

In  any  event,  node  2 is  exited  to  the  node  determined  by  the 
earliest  of  the  following: 

i)  An  intercept  is  planned  for  CAP  vs.  the  Recon, 
with  estimated  intercept  time  T3 

ii)  The  raid  is  detected  by  AEW  aircraft  at  T4 

iii)  Attack  Aircraft  Launch  their  Missiles  at  T5. 


Node 3: Intercept is Planned for Time T3
(CAP Intercepts Recon)

Entry into this node is at the time the planning decision took
place, which may be taken to be time T1 plus an appropriate delay. As
indicated on Figure 2, the raid may or may not have already been detected
by the AEW at time T4.

This is an example of a node that is not a "game node":
decisions already made, together with various assumptions, will determine
the next node. Basically, there are three competing times: T4, T3, and
T5. If T3 (the intercept) occurs first, the next node is Attack Aircraft
Launch Missiles (node 9) if the intercept fails and Recon is Lost (node 5)
if the intercept succeeds. If T4 or T5 occurs before T3, the transition
is to Raid Detected by AEW at T4 (node 4) or Attack Aircraft
Launch Missiles (node 9), respectively, whichever occurs first.
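
The competing-times logic at node 3 is simple enough to state as code. The sketch below is illustrative only: the function name, the use of `math.inf` for an event that never occurs (for example, T4 when no AEW aircraft are employed), and the tie-breaking order T3, T4, T5 are assumptions, not features of the original model.

```python
import math


def node3_next(t3, t4, t5, intercept_succeeds):
    """Return the number of the node entered on leaving node 3.

    t3 -> planned intercept time of CAP vs. the Recon
    t4 -> time the raid is detected by AEW (math.inf if it never is)
    t5 -> time the attack aircraft launch their missiles
    intercept_succeeds -> outcome of the intercept, if it takes place
    """
    first = min(t3, t4, t5)
    if first == t3:
        # Intercept happens first: Recon is Lost (5) or launch follows (9).
        return 5 if intercept_succeeds else 9
    if first == t4:
        return 4    # Raid Detected by AEW at T4
    return 9        # Attack Aircraft Launch Missiles


print(node3_next(40, 50, 60, True),
      node3_next(40, 50, 60, False),
      node3_next(40, 35, 60, True),
      node3_next(40, math.inf, 30, True))
```

In a Monte Carlo run the times and the intercept outcome would be sampled; in the dynamic programming calculation the same branches would instead be weighted by their probabilities.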

Node  4:  Raid  Detected  by  AEW  at  T4 

It is understood that this node can be entered only when AEW
aircraft are selected by Blue; this would be implemented by making
transition probabilities on arcs into this node equal to zero when AEW
aircraft are not selected. This node
is  an  example  of  another  kind  of  nongame  node  because  Blue  has  decisions 
to  make  while  Red  does  not.  Instead  of  a game  involving  two-sided 
optimization  by  Blue  and  Red  we  have  one-sided  optimization  by  Blue  alone. 

Blue decides whether the CV should go active if it is now
passive (b5 = 1 means yes, 0 means no). Blue also decides whether to
launch interceptors from the deck for an intercept at time T6 (decision
variable b6), and whether or not to divert the CAP aircraft on intercept
of the Recon to intercept the raid instead (decision variable b7). Blue
should consider these three binary decisions jointly (there are 2^3 = 8
cases) and maximize the node 4 payoff function defined in terms of these
decisions.
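
Because there are only eight cases, the one-sided optimization at node 4 can be carried out by direct enumeration. The sketch below assumes a payoff function of the three binary decisions is available; the function `toy_payoff` and its coefficients are arbitrary placeholders invented for illustration, not values from the model, and `best_node4_decision` is a hypothetical helper name.

```python
from itertools import product


def best_node4_decision(payoff):
    """Enumerate the 2**3 joint decisions (b5, b6, b7) and pick Blue's best.

    payoff(b5, b6, b7) -> mean payoff to Blue for that joint decision
    """
    cases = list(product((0, 1), repeat=3))
    best = max(cases, key=lambda d: payoff(*d))
    return best, payoff(*best)


def toy_payoff(b5, b6, b7):
    """Placeholder payoff with arbitrary coefficients (illustration only)."""
    return 1.0 - 0.5 * b5 + 2.0 * b6 + 1.5 * b7 - 1.0 * b6 * b7


decision, value = best_node4_decision(toy_payoff)
print(decision, value)
```

The interaction term in the placeholder shows why the decisions are considered jointly: the benefit of one decision can depend on the setting of another.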

Node 5: Reconnaissance is Lost at T3


The  Recon  is  considered  lost  when  it  is  either  successfully 
intercepted or decides to leave the area, the latter being permitted by
the earlier variables r2 and r3. In either case the time is T3.

Both  Blue  and  Red  can  be  assumed  to  know  of  the  occurrence  of 
this  event,  and  potentially  there  is  an  opportunity  for  making  new 
decisions whose outcome depends upon the Recon being lost. For Red the
problem  is  more  critical  if  the  Recon  was  actually  necessary  for  success 
of  the  attack  mission  as  we  assume  here.  It  has  been  assumed  that  the 
attackers  fly  at  low  altitude  as  long  as  possible,  and  loss  of  Recon 
implies  loss  of  information  on  the  CV. 

The  only  decision  defined  for  this  node  at  present  is  Red 
decision  r4 : 

r4  = 0 if  the  attackers  return  home  without  contact 
(node  12) 

r4  = 1 if  the  attackers  elect  to  continue  the  mission, 
implying  that  one  or  more  of  them  must  pop-up 
to  search  for  the  CV. 

Node 6: DLI Launched to Intercept Raid at T6

The launch decision was made at node 4; intercept time is T6.
Node 6 may have been entered directly from node 4 or indirectly by the
Recon is Lost, Attackers Pop-up, CV Detected Pop-up path. The submodel
at this node must determine transition probabilities for node 9
(Attack Aircraft Launch) and node 14 (Attack Aircraft Destroyed Before
Launch). The transition out of node 6 depends on whether T5
is earlier than T6. For particular times T5 and T6 this is simple to
resolve. However, T5 and T6 are random variables, and the transition
probability from node 6 to node 9 is actually Prob(T5 ≤ T6).
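
One way to obtain such a probability is by sampling. The sketch below is a minimal illustration, not the report's submodel: it assumes T5 and T6 are independent and normally distributed with arbitrary means and standard deviations, and the function name and sampler arguments are hypothetical. For independent normal times a closed form exists, which is used here only as a check on the estimate.

```python
import random
from math import erf, sqrt


def prob_launch_before_intercept(sample_t5, sample_t6, n=20000, seed=0):
    """Monte Carlo estimate of Prob(T5 <= T6).

    sample_t5, sample_t6 -> functions taking a random.Random instance
                            and returning one sampled time
    """
    rng = random.Random(seed)
    hits = sum(sample_t5(rng) <= sample_t6(rng) for _ in range(n))
    return hits / n


# Arbitrary illustrative distributions (assumed, not from the model).
MEAN_T5, SD_T5 = 50.0, 5.0
MEAN_T6, SD_T6 = 55.0, 5.0

estimate = prob_launch_before_intercept(
    lambda rng: rng.gauss(MEAN_T5, SD_T5),
    lambda rng: rng.gauss(MEAN_T6, SD_T6),
)

# Closed form for independent normals: Phi((m6 - m5) / sqrt(s5^2 + s6^2)).
z = (MEAN_T6 - MEAN_T5) / sqrt(SD_T5 ** 2 + SD_T6 ** 2)
exact = 0.5 * (1.0 + erf(z / sqrt(2.0)))

print(round(estimate, 3), round(exact, 3))
```

The complementary probability, 1 - Prob(T5 ≤ T6), would then be the transition probability from node 6 to node 14.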

Nodes 7 and 8: Attackers Pop-up to Search for the CV and CV
Detected Pop-up

These are combined and treated in detail in the next section,
using decision variables D for Red and b8 for Blue, where D = distance
from planned launch point to pop-up point and b8 = 0 if the CV remains
in EMCON, b8 = 1 if the CV goes active.


Node 9: Attack Aircraft Launch Missiles at T5

This  node  is  probably  the  most  critical  of  all,  for  arrival  in 
this  node  implies  that  the  launch  of  ASCMs  occurs.  The  transition  may 
have  been  made  from  any  of  the  nodes  2,  3,  4,  6,  7,  or  8.  Transitions 
out  of  node  9 are  consistent  with  others  on  the  diagram  in  their  black- 
white  character:  either  the  missiles  are  detected  by  the  CV  or  the 
missiles  impact  the  CV. 

Blue  may  be  assumed  to  know  of  the  launch  if  the  raid  had  been 
detected  by  AEW  aircraft.  Blue  decisions  are  dependent  upon  whether  the 
launch  was  detected;  if  detected,  Blue  decides  whether  to  divert  CAP  or 
DLI from any other intercept mission to the missiles (decision variable
b9) and whether to employ countermeasures such as chaff (decision
variable b10). Certain Red decisions (which would actually be made much
earlier  in  the  real  world)  may  be  considered  made  at  this  point  in  the 
model;  in  particular  those  controllable  settings,  thresholds,  etc., 
which  determine  the  missiles'  flight  profile  and  homing  mode  may  be 
considered  as  node  9 decisions. 


There  will  be  uncertainty  on  both  sides  in  a fully  formulated 
game  at  this  node.  Blue  will  be  uncertain  as  to  whether  the  Red  attack 
aircraft  are  needed  by  the  missiles  after  launch.  Red  will  be  uncertain 
about  those  aspects  of  the  problem  that  actually  determine  the  optimal 
settings  and  thresholds  but  which  are  unknown  at  the  time  the  settings 
are  actually  made. 

Node 10: Missiles are Detected by CV


In  this  node  Blue  has  decisions  which  relate  to  the  use  of 
chaff  or  other  countermeasures,  whether  or  not  to  divert  CAP  or  DLI  now 
on  intercept  mission,  and  whether  to  employ  DLI  against  the  missiles. 
These decisions have been mentioned before for earlier nodes; further
analysis is needed to determine what node or nodes they should be in.
Similarly for Red's decisions, the control settings may be more appropriate
in this node than in node 9.

Nodes  11-15:  Terminal  Nodes 

These  are  terminal  nodes  defined  primarily  for  purposes  of 
defining  the  payoff  structure;  there  are  no  decisions  made  here. 


V A DETAILED  MODEL  AT  A GAME  NODE 


This section considers a particular node in some detail in order to
demonstrate the methodology. Actually, a pair of nodes (7 and 8) is
involved, since it turns out to be analytically convenient to combine
these two nodes into one which we will simply call node 7'. The new
node 7' then can be entered only from node 5 while transitions can be
made to nodes 6, 9, or 12, according to Figure 2.

Node 7' defines several states, one for each path from node 1 to
node 7' together with the times Ti associated with these paths. From
this auxiliary information (path and times) one can construct a family
of geometrical situations, an example of which is shown in Figure 4.
Relative to some time reference not shown, T1 and T2 mark the respective
positions of the Recon at times T1 and T2, while T3 and T4 mark the
position of the raid at times T3 and T4, respectively. Missile launch
will occur at time T5 in the position shown if the attack aircraft are
not shot down before reaching this point.

Also  known  in  this  node  is  a variable  e , which  makes  its 
appearance  for  the  first  time.  We  may  think  of  e as  a variable 
initially set equal to the Blue decision variable b2 (which specifies whether
the CV is initially active) and reset according to the other bi which have
to do with the later passive-to-active transition possibilities. Although
e does  not  appear  in  Figure  2,  it  would  be  possible  to  introduce  it 
there  by  adding  nodes  where  the  Blue  decisions  to  go  active  can  be  made, 
and  providing  two  branches  out  of  these  nodes  corresponding  to  "remain 
passive"  (e  = 0)  and  "go  active"  (e  = 1).  The  value  of  e will  then  be 
known  because  it  is  part  of  the  path  memory  at  a state. 

A.  Monte Carlo Flow Diagram


Transition models associated with nodes can be expressed in many
ways. It turns out to be convenient to define the transition model at
node 7' in the form of a flow-chart for a simulation model. Equations
will then be derived for payoff functions directly from the flow-chart,
and the node 7' game then defined using these payoff functions to
determine the payoff entries for an ordinary matrix game.

Figure 5 shows a flow-chart for the transition model of node 7',
the  aggregated  node  composed  of  nodes  7 and  8 in  the  original  nodal 
diagram.  The  flow-chart  is  presented  as  though  it  were  for  a subroutine 
fitting  into  a larger  computer  program  representing  a Monte  Carlo 
simulation  of  the  entire  effectiveness  model.  Program  flow  is  therefore 
in  the  forward  (increasing  time)  direction,  the  opposite  of  the  dynamic 
programming  algorithm.  Inputs  are  shown  at  the  top  of  the  figure  for 
the  Monte  Carlo  subroutine  interpretation,  while  the  output  for  a given 
entry is the node to which flow is transferred at the exit. For convenience,
the payoff values (which are specified for terminal nodes and
derived  by  dynamic  programming  for  the  internal  nodes)  are  shown  at  the 
exits,  as  though  these  payoffs  were  "picked  up"  at  the  time  of  exit. 

In outline, the flow is: find the time of weapon launch (T5) and
the time DLI intercepts the raid (T6) from the detection times T2 and
T4. These calculations will involve geometry from Figure 4 as well as
the detection times. Because the decision to pop up has been made
before entering this node, Red must decide when to pop up. This timing
decision is defined implicitly in terms of a pop-up distance D, measured
from the position of the previously planned launch point. The time
corresponding to distance D is denoted by T(D), where T(·) is a known
function. For a chosen D, the question "Is T6 ≤ T(D)?" means "Does the
DLI intercept the raid before pop-up time?"


If the answer is "yes," the transition is to node 14 for payoff V14;
otherwise the flow continues with a detection check: "Is the CV detected
by the attack aircraft popping up at distance D from the CV?" Absence of
detection, which has probability 1 - p1(D), implies another transition,
to terminal node 12 and payoff V12. After deciding whether the CV has been
detected, the other detection question is asked: "Does the CV detect the
aircraft doing the pop-up?" For detection, it is assumed necessary that
the CV be active upon entry into node 7' or go active because the Blue
decision variable b8 specifies it. When the CV is active, the probability
of detection is assumed to be a known function p2(D). In a Monte Carlo
sense the "yes" decision would be made a fraction p2(D) of the time, the
"no" decision a fraction (1 - p2(D)) of the time. In either case the
transition is to node 9, with its derived payoffs V9^(0) and V9^(1) for the
(no detection of raid by CV, detection of raid by CV) cases, respectively.


B.  Solution  of  the  Node  Game 

1.  Exit  Probabilities 

The Monte Carlo model whose flow diagram is shown in Figure 5
need never be implemented as such. Instead, a Markov model equivalent to
Figure 5 is used, and from it the four probabilities of exit (to nodes 12, 14,
and 9 with and without CV detection) are determined analytically. It is not
necessary to actually draw this equivalent diagram because the probabilities
needed are already shown at the branch points in Figure 5. The
general rule needed is: the probability of exiting at a node with payoff
Vx is a sum over all paths from the entry point to node x. Each term
of the sum is the product of the probabilities on branches along the path,
i.e., the probability of the path.

Using this rule, the probability of transferring to terminal
node 12 (p.12) is:

    p_{.12} = P(T_6 > T(D)) \, [1 - p_1(D)] .    (1)

Similarly, the probability of transferring to terminal node 14 (p.14) is:

    p_{.14} = P(T_6 \le T(D)) .    (2)

Continuing with the two probabilities of transferring to node 9, the
probability of transferring to node 9 without the CV detecting the raid
(p.90) is:

    p_{.90} = P(T_6 > T(D)) \, p_1(D) \, \{ (1-e) [ P(b_8 = 0) + P(b_8 = 1) \bar{p}_2(D) ] + e \, \bar{p}_2(D) \}    (3)

where a bar over a quantity signifies unity minus the quantity. Finally,
since the four probabilities of transition out of the node must sum to
one, the probability of transferring to node 9 with CV detection of the
raid (p.91) is:

    p_{.91} = 1 - ( p_{.12} + p_{.14} + p_{.90} ) .    (4)
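
As an illustration of formulas (1)-(4), the following minimal sketch evaluates the four exit probabilities for one pop-up distance D. The function name and argument names are hypothetical; q stands for P(T6 > T(D)), and p1, p2 for p1(D) and p2(D), all assumed to have been computed elsewhere.

```python
def exit_probabilities(q, p1, p2, e, prob_b8_active):
    """Exit probabilities (1)-(4) for node 7' at one pop-up distance D.

    q              -- P(T6 > T(D)): the raid reaches pop-up before DLI intercept
    p1             -- p1(D): attack aircraft detect the CV
    p2             -- p2(D): an active CV detects the aircraft popping up
    e              -- 1 if the CV is already active on entry, else 0
    prob_b8_active -- P(b8 = 1): Blue's probability of going active
    """
    # Probability that the CV does NOT detect the raid (the braces in (3)).
    cv_misses = (1 - e) * ((1 - prob_b8_active) + prob_b8_active * (1 - p2)) + e * (1 - p2)
    p12 = q * (1 - p1)               # equation (1)
    p14 = 1 - q                      # equation (2)
    p90 = q * p1 * cv_misses         # equation (3)
    p91 = 1 - (p12 + p14 + p90)      # equation (4)
    return {"12": p12, "14": p14, "90": p90, "91": p91}
```

With e = 0 and P(b8 = 1) = 1, the last probability reduces to q · p1 · p2, which is the product along the single path that ends in detection by the CV.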


2.  Payoff Function

Using formulas (1)-(4), the expected value of the payoff from
node 7 onwards is:

    V_{7'}(T_2, T_4, e; D, b_8) = p_{.12} V_{12} + p_{.14} V_{14} + p_{.90} V_9^{(0)} + p_{.91} V_9^{(1)} .    (5)

In equation (5), the V9^(0) and V9^(1) terms on the right are functions of
T2, T4, and e, just as the value term on the left is. Equation (5) is the
payoff function for this node. As it stands, V7' depends upon its five
arguments T2, T4, e; D, and b8. The last two (D and b8) are decision
variables which will be eliminated by the min-max operation.

3.  Value Function

The value function at stage 7' results from playing a zero-sum,
two-person game using the payoff function (5) to determine elements in
the payoff matrix. The value function will be a function of the continuous
variables T2 and T4 and the binary variable e. Since the game to
be formulated in the next paragraph is an ordinary matrix game, it will
not include T2 and T4 as explicit parameters in the answer. It will
therefore be necessary to solve the node 7' game at least eight times.
(Two levels would be required for the binary variable e, and at least
two levels each would be independently required by variables T2 and T4.)
Linear interpolation would be used in the variables T2 and T4, and
higher order polynomials requiring additional game solutions would be
used only if required.
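
The interpolation step can be sketched as follows, assuming the node 7' game has already been solved at every grid point. The function name, the grid lists, and the dictionary layout are hypothetical choices made for illustration only.

```python
from bisect import bisect_right


def interpolate_value(t2, t4, e, t2_levels, t4_levels, solved_values):
    """Bilinear interpolation in (T2, T4) of game values solved on a grid.

    solved_values[(e, i, j)] is the game value obtained with
    T2 = t2_levels[i], T4 = t4_levels[j] and binary variable e.
    Each level list must be sorted and hold at least two entries.
    """
    def bracket(t, levels):
        # Index of the grid cell containing t, and the fractional position in it.
        i = min(max(bisect_right(levels, t) - 1, 0), len(levels) - 2)
        w = (t - levels[i]) / (levels[i + 1] - levels[i])
        return i, w

    i, wi = bracket(t2, t2_levels)
    j, wj = bracket(t4, t4_levels)
    v = solved_values
    return ((1 - wi) * (1 - wj) * v[(e, i, j)]
            + wi * (1 - wj) * v[(e, i + 1, j)]
            + (1 - wi) * wj * v[(e, i, j + 1)]
            + wi * wj * v[(e, i + 1, j + 1)])
```

With two levels for each of T2 and T4 and two values of e, the dictionary holds exactly the eight game solutions mentioned above. Points outside the grid are extrapolated linearly from the nearest cell, which is a choice of this sketch rather than something the text specifies.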


4.  Game  Matrix 

The payoff function in equation (5) can be written compactly
as a function of D and of the probabilities of b8, as shown in equation
(6):

    V_{7'} = f_1(D) + f_2(D) \, P(b_8 = 0) + f_3(D) \, P(b_8 = 1) .    (6)

This formula may be used to display the game matrix for a discrete game
approximating the continuous one when D can take on only discrete levels
D1, D2, ..., Dn. Table 1 shows the game matrix.
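
The text does not list f1, f2, and f3 explicitly. Expanding (5) with (1)-(4) and using P(b8 = 0) + P(b8 = 1) = 1 suggests the forms coded below; they should be read as a reconstruction under that assumption, not as the authors' stated expressions. The function names and the keyword-argument layout are hypothetical.

```python
def payoff_coefficients(q, p1, p2, e, v12, v14, v9_0, v9_1):
    """Coefficients f1, f2, f3 of equation (6) at one pop-up distance D.

    q = P(T6 > T(D)), p1 = p1(D), p2 = p2(D); v12, v14, v9_0, v9_1 are the
    payoffs V12, V14, V9^(0), V9^(1) picked up at the exits.
    """
    # Expected node-9 payoff when the CV is radiating during the pop-up.
    v9_active = (1 - p2) * v9_0 + p2 * v9_1
    f1 = (1 - q) * v14 + q * (1 - p1) * v12 + q * p1 * e * v9_active
    f2 = q * p1 * (1 - e) * v9_0        # multiplies P(b8 = 0)
    f3 = q * p1 * (1 - e) * v9_active   # multiplies P(b8 = 1)
    return f1, f2, f3


def game_matrix(levels):
    """Two-row game matrix: row 0 is b8 = 0, row 1 is b8 = 1.

    `levels` holds one dict of payoff_coefficients arguments per level Dj,
    so the columns are Red's choices D1, ..., Dn.
    """
    coeffs = [payoff_coefficients(**level) for level in levels]
    return [[f1 + f2 for f1, f2, _ in coeffs],
            [f1 + f3 for f1, _, f3 in coeffs]]
```

Each column entry is f1 + f2 in the "remain passive" row and f1 + f3 in the "go active" row, which matches the bracketed terms that appear later in equation (11).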


The solution of the game in Table 1 yields a value V7'*, where the
asterisk  means  "associated  with  the  game  solution."  It  is  the  value  of 
the  game  that  is  played  starting  from  node  7 in  the  original  structure, 
i.e.,  it  is  the  average  payoff  to  Blue  if  both  Red  and  Blue  play  optimal 
strategies  from  node  7 on.  Optimal  decision  variables  (often  in  the 
form  of  mixed  strategies)  are  also  game  outputs  which  can  be  thought  of 
as  associated  with  the  node;  they  are  functions  of  the  same  variables  as 
the  value.  In  the  Monte  Carlo  interpretation  these  variables  would  be 
used when the flow of the model arrived in node 7'. When needed, random
numbers  would  be  selected  in  accordance  with  the  optimal  probabilities 
and  used  to  make  the  decisions  which  determine  the  next  node. 
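
In a Monte Carlo replication, drawing a decision from a mixed strategy might look like the sketch below. The function name and the example probabilities are hypothetical; only the standard library is used.

```python
import random


def sample_decision(options, probabilities, rng=random):
    """Draw one decision according to a mixed strategy.

    options       -- the pure decisions, e.g. pop-up distances or b8 values
    probabilities -- the optimal probabilities from the game solution
    rng           -- the random module or a seeded random.Random instance
    """
    return rng.choices(options, weights=probabilities, k=1)[0]


# Example: Blue goes active (b8 = 1) three times out of four.
rng = random.Random(12345)
b8 = sample_decision([0, 1], [0.25, 0.75], rng)
```

Passing a seeded generator makes a replication reproducible, which is convenient when the same random stream must be reused across runs.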


VI  MODELING UNCERTAINTY

One  vital  element  is  missing  above,  and  that  is  the  recognition 
and  treatment  of  uncertainty  in  the  real-world  problem.  Even  in  this 
relatively simple scenario there are several important places where
uncertainty occurs:

i)  The  Recon  does  not  know  when  or  whether  it  has  been 
detected  by  the  AEW  aircraft,  or  by  the  CV's  radars 

ii)  The  CV  does  not  know  when  or  whether  it  has  been 
detected  by  the  Recon 

iii)  The  Recon  does  not  know  whether  the  CV  has  planned 
an  attack  on  him,  given  detection. 

It  is  sometimes  difficult  to  know  whether  a particular  matter  which 
is  uncertain  in  the  ordinary  sense  of  the  word  needs  to  be  handled  as 
uncertainty  in  the  model.  A way  to  proceed  is  to  first  define  all  the 
submodels  for  the  several  nodes  in  the  game,  assuming  perfect  information, 
and  then  examine  them  one  at  a time  and  ask:  "Is  the  information  assumed 
by  the  respective  players  actually  available  to  each  at  this  point?" 

When  the  information  used  by  a party  is  in  fact  unknown,  it  then  becomes 
a candidate  for  uncertainty  modeling.  The  way  we  will  proceed  is  by 
example,  relegating  the  more  general  statement  and  model  of  uncertainty 
to Appendix B. Since node 7' is the only node which has been described
in  detail,  it  is  the  node  at  which  to  look  for  uncertainty. 

If one examines the earlier Figure 5, which shows the flow diagram
for node 7', one finds that it has been implicitly assumed that T6, the
time the DLI will intercept the raid, is known to Red as well as to Blue.
Since T6 is calculated from T4 and geometry, this implies that Red knows
T4, the time that the AEW aircraft detects the raid. This is not a
realistic assumption and we will now develop an uncertainty model to
remedy the situation. Note the asymmetry here: Blue has perfect information*
while Red has imperfect information (i.e., uncertainty). The
discussion below applies to this situation only, while Appendix B is
more general and treats the symmetrical case as well.

A.  Probability  Distribution  Approach 

It  will  turn  out  that  a satisfactory  theory  can  be  developed  if  it 
is  assumed  that  Red  knows  not  T4  itself  but  the  probability  distribution 
of  T4.  Red  will  act  as  though  the  problem  itself  is  random  by  playing 
into  the  probability  distribution  while  Blue  will  respond  in  kind  by 
incorporating  into  his  own  solution  the  knowledge  of  how  Red  will  play. 

Thus  Blue  will  use  the  same  probability  distribution  in  arriving  at  his 
own  optimal  solution.  (Blue  actually  determines  several  solutions, 
conditioned on the true value of T4. However, he uses only one of them,
the one corresponding to the actual T4, in a given play or Monte Carlo
replication.)

The  mathematics  which  incorporates  this  probability  distribution 
and  the  assumptions  associated  with  it  is  reasonably  straightforward 
and  will  be  presented  shortly.  Justifying  the  assumptions,  or  at  least 
making them plausible, is a more difficult task than handling the mathematics.
One way to try to do this is to consider another, quite different,
"scenario."

Imagine  that  there  are  two  analysts,  a Blue  analyst  and  a Red  analyst, 
whose  aim  is  to  find  optimal  strategies  for  their  respective  military 
forces  in  the  very  scenario  this  document  considers.  They  may  realize 
early  that  a good  deal  of  modeling  will  have  to  be  done,  and  it  may  be 
possible to use each other's models. Indeed, going to the extreme in
commonality, they may be able to use the same model, provided uncertainty
is properly treated.

* This is not technically correct without translating the structure into
an equivalent game.




Suppose,  for  now,  that  Red  uses  an  arbitrary  estimate  of  the 
probability  distribution  of  T4 . Suppose  further  that  the  gaming  model 
has  been  completed  and  is  ready  to  use.  The  analysts  may  run  it  for 
optimal  decision  variables  (or  strategies),  then  build  a Monte  Carlo 
version  which  employs  these  optimal  strategies.  They  can  then  run  the 
Monte  Carlo  model  and  derive  various  probability  distributions--in 
particular,  they  can  derive  the  probability  distribution  of  T4 . This 
distribution  could  then  replace  the  estimate  Red  started  with,  and  the 
process  repeated  again  and  again  until  convergence. 

So  far,  this  entire  process  appears  to  be  just  another  iterative 
process  of  a type  that  arises  frequently  in  applied  Operations  Research. 
The  Blue  and  Red  analysts  have  to  now  examine  it  and  see  if  one  is 
taking  unfair  advantage  of  the  other.  Since  Red  is  controlling  the 
distribution,  it  appears  to  be  he  who  is  somehow  getting  the  better  of 
the  situation,  if  either  is.  What  sort  of  complaint  could  Blue  register 
about  the  process  being  biased  against  him? 

With  the  possible  exception  of  complications  due  to  the  introduction 
of  uncertainty,  Blue  knows  that  they  are  involved  in  a zero-sum,  two- 
person  game.  One  of  the  more  outstanding  properties  of  the  solutions  to 
such  games  is  that  neither  side  can  exploit  the  optimal  strategies  of 
the  other,  i.e.,  each  side  can  give  the  other  his  own  optimal  strategy 
and  lose  nothing  by  it.  Therefore,  in  this  two-analyst  context,  the 
Blue  analyst  cannot  lose  anything  by  giving  the  Red  analyst  the  optimal 
Blue  strategy.  (In  fact,  these  strategies  were  already  needed  above  to 
run the Monte Carlo model to obtain the distribution of concern.) Therefore,
there seems to be no legitimate complaint Blue can raise about bias.

It  would  lead  too  far  afield  to  pursue  this  matter  further  in  this 
document.  The  critical  point  is  that  the  postulated  common  model  that 
was  considered  above  really  seems  to  exist--if  shipboard  facilities  were 
available for Blue to actually put the model aboard ship he would have
no  incentive  to  change  it  after  he  and  the  Red  analyst  are  finished  with 
their  joint  venture.  (A  more  formal  related  discussion  is  to  be  found 
in  Reference  2 (Harsanyi).  Harsanyi's  context  is  n-person,  non-zero 
sum  games  involving  Nash  equilibrium  points  and  considerable  game  theory 
background  is  required.) 

B.  Mathematical Formulation

Turning to the mathematical formulation for the example, assume
that {pk} is a discrete estimate of the probability distribution of T4;
specifically, let

    p_k = Prob(T_4 = T_4^{(k)})  for k = 1, 2, ..., K ,    (7)

where the T4^(k) are the approximating levels of T4. Then, according to
the model in Appendix B, the game matrix in Table 1 has to be duplicated
K times and the kth duplicate scaled by pk. These scaled duplicates are
then stacked as shown in Table 2 to form a game matrix. The top block
is Table 1 scaled by p1, with the Blue decision variable b8 superscripted
by "1" to correspond to k = 1. The column structure (for Red's decision
variables) is the same as it was in Table 1, reflecting a lack of choice
for Red, while the duplication of blocks in the row structure reflects
Blue's multiplicity of choices.
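
The stacking operation can be sketched in a few lines. The function name is hypothetical, and the sketch accepts one block per level of T4 so that the entries of each block may depend on T4^(k); whether the blocks differ is an assumption of this illustration.

```python
def stack_scaled_blocks(blocks, probs):
    """Stack K copies of the node game matrix, the kth scaled by p_k.

    blocks -- list of K matrices (lists of rows); rows are Blue's b8
              choices, columns are Red's pop-up distances D1, ..., Dn
    probs  -- list of K probabilities p_k, one per level of T4
    """
    if len(blocks) != len(probs):
        raise ValueError("need one probability per block")
    stacked = []
    for block, p in zip(blocks, probs):
        for row in block:
            stacked.append([p * entry for entry in row])
    return stacked
```

For K levels of T4 and a two-row node game, the result has 2K rows and the same n columns as Table 1, matching the structure described for Table 2.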

Denoting the entire game matrix in Table 2 by P, the Linear
Programming tableau can be constructed (Table 3). In game theory parlance,
the problem has now assumed the form of a constrained matrix game.
The upper left corner of the tableau is the payoff matrix P from Table 2.
The row of ones below P expresses the probability constraint imposed on
Red:

    \sum_{j=1}^{n} P(D = D_j) = 1 .

Table 3

LINEAR PROGRAMMING TABLEAU FOR TABLE 2

[Tableau not reproduced; its bottom row is the objective function.]

Corresponding to this row are several columns of ones expressing Blue's
probability constraints:

    P(b_8^{(k)} = 0) + P(b_8^{(k)} = 1) = 1  for k = 1, 2, ..., K .

The bottom row expresses the objective function, which is to be
maximized:

    Z = \sum_{k=1}^{K} 1 \cdot \lambda_k .

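Defining \beta_j to be P(D = D_j), the equation form of the primal (Red)
problem is to choose \beta_j \ge 0 and any \lambda_k satisfying:

    \sum_{j=1}^{n} \beta_j = 1 ,

    \sum_{j=1}^{n} p_{ij} \beta_j + \lambda_{i'} \le 0 ,   i = 1, 2, ..., 2K

to maximize

    Z = \sum_{k=1}^{K} \lambda_k

where P = (p_{ij}) and i' = i/2 if i is even and i' = (i+1)/2 if i is
odd.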

The optimal primal solution to this constrained matrix game is a set
of values for Red:

    \beta_1^*, \beta_2^*, ..., \beta_n^*

and the multipliers

    \lambda_1^*, ..., \lambda_K^* .

The optimal dual (Blue) solution comes as a by-product of the primal
solution. Letting

    \alpha_k = Prob(b_8^{(k)} = 0)  and  1 - \alpha_k \equiv \bar{\alpha}_k = Prob(b_8^{(k)} = 1) ,

the optimal Blue solution is:

    \alpha_1^*, \alpha_2^*, ..., \alpha_K^*  and the multiplier p^* .

The  value  of  the  game  is  - Z* , where  Z*  is  the  optimal  value  of  the 
objective  function  output  by  the  linear  program. 

This solution is used in different ways, depending upon whether we
are interested in its use in the game solution (by dynamic programming)
or in the Monte Carlo interpretation of the model. In the Monte Carlo
interpretation Red will use the mixed strategy dictated by the
Prob(D = Dj) whatever the true value of T4, and Blue will use a mixed
strategy for the true value (known to Blue, but not to Red) of T4.
(Interpolation may often be needed because in general the true T4 will
not equal one of the finite set {T4^(k)}.)

In the gaming solution the usage is different; instead of just one
pair of values being used, the entire solution is used. For the kth
value of T4, the value of the node 7' game played when Blue has T4^(k) is:

    V^{*(k)} = \sum_{j=1}^{n} \beta_j^* \, p_k \, [ (f_2(D_j) + f_1(D_j)) \alpha_k^* + (f_3(D_j) + f_1(D_j)) (1 - \alpha_k^*) ] .    (11)

This is simply the usual sum-of-double-products formula for the game
value in terms of the strategies and the payoff matrix elements, modified
by restricting the row player's variables to the kth block. Expressing
this in terms of the earlier notation for value (text following Table 1),
the game value at its arguments is:

    V_{7'}^*(T_2, T_4^{(k)}, e) = V^{*(k)}(T_2, e) ,    (12)

where the dependence of the quantity on the right on T2 and e has been
made explicit.

Appendix A

A DETAILED  DIAGRAM  OF  THE  FLEET  DEFENSE  MODEL 

This appendix is devoted to further discussion of the Markov model
derivable from Figure 2. Figure A-1 shows a tree derived from Figure 2
by tracing through the diagram from node to node, avoiding paths which
enter the same node more than once. For clarity, only node numbers are
shown. The main part of the graph is traced downwards to end in either
a terminal node or in node 9, which is outside the loop. Node 9 and
its successors are shown separately on the bottom of the page; this
constellation would replace each node 9 in the main graph above if the
graph were done in full.

Since the graph in Figure A-1 is a tree, there is a unique path
from the circle representing node 1 to any other circle. Therefore, the
circles can be identified with states as defined in the text. By simply
counting the number of circles with a given node number, the number of
states associated with each node in Figure 2 can be determined; the
results are in Table A-1.

Because the length of a path is related in some sense to the
complexity of the state, the distribution of path lengths is of some
interest. Table A-2 shows the distribution derived from Figure A-1.
Path length is defined to be the number of nodes appearing in its
definition; e.g., path 1-2-3-5 has length four.

Table A-1

NUMBER OF STATES AT A NODE


Table A-2

PATH LENGTH DISTRIBUTION


Appendix  B 

A METHOD  FOR  INCORPORATING  UNCERTAINTY 
WITH  CONSTRAINED  MATRIX  GAMES 

This  appendix  contains  a more  general  treatment  of  the  uncertainty 
model  introduced  by  example  in  the  last  section  of  the  text.  The  theory 
is  closely  related  to  Harsanyi's  more  general  model  (References  2,  3,  4) 
which  applies  to  n-person,  non-zero  sum  games  using  Nash  equilibrium 
points.  (Two-person,  zero-sum  games  are  used  as  examples  in  the  second 
of  these  references,  and  the  theory  appears  to  be  almost  identical  to 
that  expressed  here.  A difference  is  that  Harsanyi  uses  the  normal  form 
of  the  game  for  solution,  which  implies  joint  strategies,  while  this 
appendix  uses  a constrained  matrix  game  approach  with  independent 
strategies.) Harsanyi's set of papers requires considerable background
in  game  theory.  An  elementary  example  using  the  theory  appears  in 
Reference  5;  a game  in  the  Anti-Ballistic  Missile  field  was  formulated 
to  incorporate  uncertainty  in  the  number  of  defensive  interceptors. 

The basic idea is simple: reduce the game with incomplete information
to one with complete information by introducing a lottery which
decides which of the several possible games will be played on a given
trial. In the most general case, neither player knows the lottery outcome
(i.e., knows the game being played). However, they do know the
probability  distribution  governing  the  lottery  and  in  addition  they  may 
have  partial  information  about  the  lottery  outcome. 

We  will  proceed  in  steps  rather  than  going  directly  to  the  most 
general  symmetrical  case.  Suppose  there  are  two  subgames  whose  payoff 
matrices P1 and P2 are shown in Table B-1. Suppose further that a
lottery will select game 1 with probability p1 = 1/3 and game 2 with
probability p2 = 2/3.

As shown, the probability that Blue will choose column 1 in game g is
x1^(g), and x2^(g) (= 1 - x1^(g)) is the probability that Blue will choose
column 2 in game g. Similarly, y1 and y2 are Red's probabilities of selecting
rows 1 and 2. For later reference, solutions of the games are shown
when both Red and Blue know the game being played. (Game 1 has a pure
strategy solution x2^(1) = y2 = 1 with value V1* = 0. Game 2 played in the
usual way has a mixed strategy solution x1^(2) = 3/8, x2^(2) = 5/8, y1 = 7/8,
y2 = 1/8 with value V2* = 3/8.)

Now assume that Blue knows the outcome of the lottery, i.e., knows
which game is being played, while Red knows only the probabilities
p1 and p2.


Table B-1

PAYOFF MATRICES P1 AND P2 FOR TWO SUBGAMES

Game 1 (p1 = 1/3)

                 x1^(1) = 0     x2^(1) = 1
    y1 = 0           0              2
    y2 = 1          -1              0*

    (* saddlepoint; value zero)

Game 2 (p2 = 2/3)

                 x1^(2) = 3/8   x2^(2) = 5/8
    y1 = 7/8         1              0
    y2 = 1/8        -4              3

    (Value 3/8)

Adopting Red's viewpoint, the average payoff V as a function of his
probabilities y1 and y2 is found by conditioning on the subgame and
summing:

    V = \sum_{g=1}^{2} E(V | game = g) \, P(game = g)
      = (y^T P_1 x^{(1)}) (1/3) + (y^T P_2 x^{(2)}) (2/3)
      = y^T ( (1/3) P_1 x^{(1)} + (2/3) P_2 x^{(2)} ) .    (B-1)

Equation (B-1) is the payoff function for a new "lottery" game whose
payoff matrix is shown in Table B-2.

Optimal solutions* to the lottery game in Table B-2 are x1^(1)* = 0,
x2^(1)* = 1, x1^(2)* = 1/4, x2^(2)* = 3/4 for Blue and y1* = 7/8, y2* = 1/8
for Red. The game value is V* = 5/6, which is larger than the average
payoff to Blue from the two games played separately with perfect information.
(That average is 1/3 x 0 + 2/3 x 3/8 = 1/4.) Therefore, as
expected, Blue gains when Red has only probabilistic knowledge of the
game being played.


* Found by a constrained matrix game.


Table B-2

PAYOFF MATRIX FOR LOTTERY GAME

             x1^(1)    x2^(1)    x1^(2)    x2^(2)
    y1          0        2/3       2/3        0
    y2        -1/3        0       -8/3        2
2 

Optimal strategies for the two situations may be compared to see
what  compromises  have  been  made. 

• In  subgame  1 Blue  uses  the  same  pure  strategy  as  in 
the  perfect  information  (PI)  game. 

• In  subgame  2 Blue  shifts  somewhat  (from  3/8  probability 
to  1/4  probability  for  column  1). 

• Red's  optimal  strategy  is  (coincidentally)  identical 
to  his  optimal  strategy  for  subgame  2,  a reflection  in 
part  of  the  higher  probability  assigned  to  this  game  in 
the  lottery. 

• The "average PI strategy" for Red, defined to be the
average of the perfect information strategies weighted
by the lottery probabilities, works out to be
(y1, y2) = (7/12, 5/12), which is far from the optimal
y* = (7/8, 1/8).
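
The numbers quoted for this example can be checked with a short exact calculation. The sketch below is not the constrained matrix game (linear programming) method used in the text; it is a substitute that works only because Red has just two rows, so Red's strategy is a single number y1. The function name is hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations


def solve_lottery_game(blocks, probs):
    """Value and Red's optimal y1 when Red (rows, minimizer) has two rows and
    does not know the subgame, while Blue (columns, maximizer) does.

    blocks -- subgame payoff matrices, each with exactly two rows
    probs  -- lottery probabilities as Fractions
    """
    # Each Blue column in each subgame pays a + b*y1, linear in y1.
    lines = [[(p * m[1][c], p * (m[0][c] - m[1][c])) for c in range(len(m[0]))]
             for m, p in zip(blocks, probs)]

    def blue_best_reply_total(y1):
        # Blue picks his best column separately in every subgame.
        return sum(max(a + b * y1 for a, b in block) for block in lines)

    # The total is convex and piecewise linear in y1, so its minimum on [0, 1]
    # lies at an endpoint or where two lines of one block cross.
    candidates = {F(0), F(1)}
    for block in lines:
        for (a1, b1), (a2, b2) in combinations(block, 2):
            if b1 != b2:
                y = (a2 - a1) / (b1 - b2)
                if 0 <= y <= 1:
                    candidates.add(y)
    y_star = min(candidates, key=blue_best_reply_total)
    return blue_best_reply_total(y_star), y_star


P1 = [[0, 2], [-1, 0]]
P2 = [[1, 0], [-4, 3]]
value, y1 = solve_lottery_game([P1, P2], [F(1, 3), F(2, 3)])
```

Calling the same function with a single block and probability 1 gives the perfect information solution of that subgame, so the weighted average value and the "average PI strategy" can be formed from two further calls.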

A more general and symmetrical formulation of a game model involving
uncertainty is one in which each side has some one thing (such as a
resource level) known to it and not to the other side. Let aij be the
probability that the lottery will assign resource level i to Red and
resource level j to Blue. Both Red and Blue will know the aij, and for
each given realization of the game Red will know i but not j and Blue
will know j but not i. Let Pij be the payoff matrix for the i,j
case; both Blue and Red know the Pij as well as the aij. (Pij is the
game matrix for the perfect information case in which Red has i, Blue
has j, and both know i and j.)

The  payoff  matrix  for  this  more  general  lottery  game  has  a form 
which  can  be  anticipated  from  the  simpler  case  above.  It  is  shown  in 
Table  B-3. 

Table B-3

A MORE GENERAL LOTTERY GAME

              x^(1)      x^(2)     ...     x^(n)
    y^(1)    a11 P11    a12 P12    ...    a1n P1n
    y^(2)    a21 P21    a22 P22    ...    a2n P2n
     ...
    y^(m)    am1 Pm1    am2 Pm2    ...    amn Pmn

The x^(j) are vectors of probabilities to be played if Blue's
resource level is j. Similarly, the y^(i) shown by the rows are vectors
of probabilities to be played by Red. Individual elements aij Pij are
submatrices of the total matrix, where the k,l-th element of the submatrix
is aij times the k,l-th element of Pij. Such a payoff matrix will be used
in a constrained matrix game formulation for solution by linear programming
in the following example.
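
Assembling the block matrix of Table B-3 can be sketched as follows. The function name and the nested-list layout are hypothetical; the two-subgame example earlier in this appendix is used as a check by treating Red as having a single resource level and Blue as having two.

```python
from fractions import Fraction as F


def lottery_block_matrix(a, P):
    """Full payoff matrix whose (i, j) block is a[i][j] * P[i][j].

    a -- a[i][j], lottery probability of Red level i and Blue level j
    P -- P[i][j], payoff matrix (list of rows) for that pair of levels;
         all blocks in one block-row must have the same number of rows
    """
    full = []
    for i, block_row in enumerate(P):
        for k in range(len(block_row[0])):       # kth row inside block-row i
            row = []
            for j, block in enumerate(block_row):
                row.extend(a[i][j] * entry for entry in block[k])
            full.append(row)
    return full


# Earlier example: one Red level, two Blue levels (the two subgames).
P1 = [[0, 2], [-1, 0]]
P2 = [[1, 0], [-4, 3]]
table_b2 = lottery_block_matrix([[F(1, 3), F(2, 3)]], [[P1, P2]])
```

With these inputs the result reproduces the entries of Table B-2 row by row.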

Example 

A modification of the well-known Berkovitz-Dresher N-stage game
(Reference 6) illustrates the methodology for the symmetrical case.
Consider a four stage game (N=4) in which the number of aircraft possessed
by Red (Blue) is uncertain to Blue (Red) in the first stage only. (During
the first stage, which may correspond to the start of a war, each side is
assumed to be able to assess the other side's resources.) Uncertainty is
assumed symmetrical; each side assumes the other has either 75%, 100%,
or 125% of the actual number of aircraft. That is, if Red actually has
100 aircraft and Blue actually has 200 aircraft, then Blue will assume
that Red has either 75, 100, or 125 aircraft, and Red will assume that
Blue has 150, 200, or 250 aircraft. The three levels for each side are
assumed equally likely, and independent. Therefore, the 3x3=9 values
of aij all equal 1/9. Table B-4 summarizes the joint probability distribution
(aij) and associated numbers of aircraft.

Table B-4

LOTTERY JOINT PROBABILITIES
AND INITIAL NUMBERS OF AIRCRAFT

                        Blue
                 150    200    250
    Red    75    1/9    1/9    1/9
          100    1/9    1/9    1/9
          125    1/9    1/9    1/9

In  this  model  aircraft  are  assigned  in  some  numbers  to  three 
missions  (Counter-Air,  Air  Defense,  and  Ground  Support)  in  such  a way  as 
to  maximize  (minimize)  for  Blue  (Red)  the  difference  between  the  total 
Blue  ground  support  and  the  total  Red  ground  support.  Symbolically,  let 
(xi, ui, mi) be the number of Blue aircraft assigned to CA, AD, and GS
missions, respectively, for stage i. Similarly, (yi, wi, ni) are
defined for Red. The payoff function is the sum over stages of the Blue
ground support minus the sum over stages of the Red ground support. Let
pi (and qi) denote the number of Blue (and Red) aircraft at the beginning
of stage i. The payoff function then becomes:

    V = \sum_{i=1}^{N} m_i - \sum_{i=1}^{N} n_i = \sum_{i=1}^{N} (p_i - x_i - u_i) - \sum_{i=1}^{N} (q_i - y_i - w_i)

because pi = xi + ui + mi and qi = yi + wi + ni.

Attrition is calculated using the expressions:

    p_{i-1} = ( p_i - (y_i - u_i)^+ )^+

    q_{i-1} = ( q_i - (x_i - w_i)^+ )^+

where "+" means "positive part." Reading the p_{i-1} expression from the
inside out, it says "first reduce the number of attackers yi by one for
each Blue defender; the result (yi - ui)^+ is the number of Red penetrators.
Then reduce the number of Blue aircraft by one for each Red penetrator;
the result is the number of Blue aircraft available at the beginning of
stage i-1."


From the solution of the original problem it is known that for
three stages the value of the game is (3 x number of Blue aircraft -
3 x number of Red aircraft). This is sufficient information to formulate
a payoff matrix for four stages when the pure strategies for the fourth
stage are given; a payoff entry is:

    (p_4 - x_4 - u_4) - (q_4 - y_4 - w_4) + 3 p_3 - 3 q_3

where Blue chooses x4, u4, m4 and Red chooses y4, w4, n4.
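
The attrition expressions and the four-stage payoff entry can be combined in a short sketch. The function names are hypothetical, and the Red allocation in the example (y4 = 50, w4 = 50) is an assumed illustration, since Red's strategy table is not reproduced in the text.

```python
def positive_part(v):
    """The "+" operator of the attrition expressions."""
    return max(v, 0)


def next_stage_strengths(p, q, x, u, y, w):
    """Aircraft remaining for Blue and Red after one stage of attrition.

    p, q -- Blue and Red aircraft at the start of the stage
    x, u -- Blue Counter-Air and Air Defense allocations
    y, w -- Red Counter-Air and Air Defense allocations
    """
    p_next = positive_part(p - positive_part(y - u))  # Red penetrators hit Blue
    q_next = positive_part(q - positive_part(x - w))  # Blue penetrators hit Red
    return p_next, q_next


def four_stage_payoff_entry(p4, q4, blue, red):
    """Payoff entry for fourth-stage pure strategies blue = (x4, u4) and
    red = (y4, w4), using the known three-stage value 3*p3 - 3*q3."""
    x4, u4 = blue
    y4, w4 = red
    p3, q3 = next_stage_strengths(p4, q4, x4, u4, y4, w4)
    return (p4 - x4 - u4) - (q4 - y4 - w4) + 3 * p3 - 3 * q3


# Blue's first strategy at resource level 200 against an assumed Red allocation.
entry = four_stage_payoff_entry(200, 100, (150, 50), (50, 50))
```

Evaluating this function over every pair of pure strategies would fill one submatrix Pij of the lottery game for the corresponding pair of resource levels.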

The solution to be presented will be an approximation based on a
small number of predetermined pure strategies for each side. Optimal
strategies, which are not known, are probably not included in these
sets.* Table B-5 shows Blue's pure strategies. The columns correspond
to the Blue resource level assumed: 150, 200, or 250. There are eight
rows and three columns in each matrix, corresponding to eight pure
strategies for each of the three resource levels. For example, the
first Blue strategy for the Blue resource level of 200 is x4 = 150
(left table, first row and second column), u4 = 50 (right table, first
row and second column), with the excess going to Ground Support:
m4 = 200 - 150 - 50 = 0. Red's strategies were defined by a similar table.


* However, the game value comes out to be quite close to the optimal;
see final paragraph.
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Table  B-5 


BLUE STRATEGIES

       X Values                    U Values

p4 =   150   200   250     p4 =   150   200   250

        75   150   175             75    50    75
       150   200   150              0     0    75
        70   150   125             80     0    75
        80   125   100             70     0    75
        50   100   150             50    50    75
        75   100   125              0    25    50
       100   125   100              0    25    50
        75   150   150             50    25    50
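
The ground support allocations implied by Table B-5 are not tabulated, but they follow from $m_4 = p_4 - x_4 - u_4$. The sketch below transcribes the table under the assumption that its entries read row by row (eight strategies by three resource levels) and that the entry printed as "10C" is 100; the variable names are illustrative.

```python
RESOURCE_LEVELS = (150, 200, 250)

# Rows are the eight Blue pure strategies; columns follow RESOURCE_LEVELS.
X_VALUES = [
    (75, 150, 175),
    (150, 200, 150),
    (70, 150, 125),
    (80, 125, 100),
    (50, 100, 150),
    (75, 100, 125),
    (100, 125, 100),
    (75, 150, 150),
]
U_VALUES = [
    (75, 50, 75),
    (0, 0, 75),
    (80, 0, 75),
    (70, 0, 75),
    (50, 50, 75),
    (0, 25, 50),
    (0, 25, 50),
    (50, 25, 50),
]


def ground_support(row, column):
    """Aircraft left for Ground Support: m4 = p4 - x4 - u4."""
    p4 = RESOURCE_LEVELS[column]
    return p4 - X_VALUES[row][column] - U_VALUES[row][column]


M_VALUES = [[ground_support(r, c) for c in range(len(RESOURCE_LEVELS))]
            for r in range(len(X_VALUES))]

# First strategy at resource level 200: 200 - 150 - 50 = 0.
print(M_VALUES[0][1])
# Every strategy is feasible (no negative ground support).
print(all(m >= 0 for row in M_VALUES for m in row))
```

Under this reading every entry is feasible, which lends some support to the row-by-row interpretation of the flattened table.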

The game value for the perfect information solution to the $p_4 = 200$,
$q_4 = 100$ case is:

$$V^{*} = 4.5(p_4 - q_4) = 450$$

as shown in Table II of Reference 6. Several values can be defined for
the uncertainty game. The game value for the situation in which the
resource levels are truly distributed according to the $a_{ij}$ is 435,
showing a slight advantage to the weaker player (Red). The uncertainty
apparently hurts Red less than it hurts Blue. On the other hand, if the
resource levels actually are 200 for Blue and 100 for Red, the game value
works out to be 402 (about 89% of the perfect information value), which
favors Red even more.
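
The comparison with the perfect information value is simple arithmetic; a brief check is sketched here. The helper name `percent_of` is an illustrative assumption.

```python
def percent_of(value, reference):
    """Return value as a percentage of reference."""
    return 100.0 * value / reference


# Perfect information value for p4 = 200, q4 = 100.
perfect_information_value = 4.5 * (200 - 100)

# Lottery game value (435) and value at the true levels 200 and 100 (402).
print(round(percent_of(435, perfect_information_value), 1))  # 96.7
print(round(percent_of(402, perfect_information_value), 1))  # 89.3
```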

Optimal strategies and their probabilities for this lottery game
are given in Table B-6 as a function of the actual number of aircraft.
Probabilities for the lottery game are shown adjacent to the allocations.
The last column on the right gives the optimal perfect information
strategies from the original game when the resource levels are 200 for
Blue and 100 for Red.
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Table  B-6 


OPTIMAL STRATEGIES

        Number of                         Lottery Game    Perfect Information
        Aircraft      x4     u4     m4    Probability     Game Probability

Blue       150        80     70      0       0.682
                     100      0     50       0.318

           200       100     50     50       0.227               0
                     150     50      0       0.318               1
                     200      0      0       0.455               0

           250       175     75      0       1.000

        Number of                         Lottery Game    Perfect Information
        Aircraft      y4     w4     n4    Probability     Game Probability

Red         75        75      0      0       0.182
                       0     75      0       0.409
                       0      0     75       0.409

           100       100      0      0       0.909             0.50
                       0    100      0       0.091             0.50

           125         0    125      0       1.0
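
As a consistency check on Table B-6, the sketch below verifies that each allocation uses exactly the available aircraft and that the lottery-game probabilities at each resource level sum to one. The grouping of table entries into rows is an assumed reading of the flattened table, and the names are illustrative.

```python
# (allocation triple, lottery-game probability) for each resource level,
# transcribed from Table B-6 under the assumed row grouping.
BLUE = {
    150: [((80, 70, 0), 0.682), ((100, 0, 50), 0.318)],
    200: [((100, 50, 50), 0.227), ((150, 50, 0), 0.318), ((200, 0, 0), 0.455)],
    250: [((175, 75, 0), 1.000)],
}
RED = {
    75: [((75, 0, 0), 0.182), ((0, 75, 0), 0.409), ((0, 0, 75), 0.409)],
    100: [((100, 0, 0), 0.909), ((0, 100, 0), 0.091)],
    125: [((0, 125, 0), 1.0)],
}


def is_consistent(strategies_by_level, tolerance=1e-9):
    """Check allocation totals and probability totals for every level."""
    for level, strategies in strategies_by_level.items():
        # Each (CA, AD, GS) allocation must use exactly `level` aircraft.
        if any(sum(allocation) != level for allocation, _ in strategies):
            return False
        # The mixed strategy at this level must be a probability distribution.
        total = sum(probability for _, probability in strategies)
        if abs(total - 1.0) > tolerance:
            return False
    return True


print(is_consistent(BLUE), is_consistent(RED))
```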

Some numerical work was done to partially bound the error due to
the approximation of the strategy space. Red was allowed to optimize
against the fixed Blue strategies (i.e., allocations and probabilities)
shown in Table B-6. Red optimized separately for each of the three
resource levels 75, 100, and 125. In all three cases the payoff could
not be reduced below the game value, showing that one half of the
approximation has zero error. The separate game values were:

$$V^{(R)}_{75} = 534 \quad \text{for } Q = 75$$

$$V^{(R)}_{100} = 445 \quad \text{for } Q = 100$$

$$V^{(R)}_{125} = 326 \quad \text{for } Q = 125$$

The average of these three values is the game value 435 mentioned
earlier. The $V^{(R)}_Q$ terms are the game values when Blue has resource
$P$ = 150, 200, and 250 with equal probability, Red has $Q$, and both use
strategies from Table B-6.
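
The averaging step can be checked directly. The sketch assumes, as the word "average" implies, that the three Red resource levels are weighted equally; the names are illustrative.

```python
from fractions import Fraction

# Game values with Red optimizing against Blue's fixed strategies,
# keyed by the Red resource level Q.
red_optimized_values = {75: 534, 100: 445, 125: 326}


def equally_weighted_value(values_by_level):
    """Average the per-level game values with equal weights."""
    weight = Fraction(1, len(values_by_level))
    return sum(weight * value for value in values_by_level.values())


print(equally_weighted_value(red_optimized_values))  # 435
```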

Allowing Blue to optimize similarly against Red resulted in:

$$V_{150} = 235$$

$$V_{200} = 435$$

$V_{250}$ = [value illegible in source]

whose average is only 0.2* higher than the game value found by using the
approximating strategies. In other words, the bounds on value provided
by the two one-sided optimizations show that the game value is in error
by only 0.2 in 435.

* Additional significant figures are needed to validate this.
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